Phase 1: profiling timers for /api/upload-project by madara88645 · Pull Request #318 · madara88645/VibeGraph

madara88645 · 2026-05-11T09:11:39Z

Context

Two weeks ago commit d74265b ("Perf: content-hash AST cache for /api/upload-project #250") landed. The follow-up question was whether first-upload (cold-cache) latency on medium repos is still dominant.

Investigation result (2026-05-11): zero open issues and zero performance complaints on GitHub since the cache merged. Production telemetry was not reachable from this run (a fresh session has no log access, as expected). The timers are therefore shipped preemptively: they are gated behind ?profile=1, so normal requests skip every timer. The data they collect will answer the latency question on the next upload made with ?profile=1.

What this adds

?profile=1 on POST /api/upload-project returns a _profile dict alongside the normal response. Each significant work unit is timed with time.perf_counter() and reported in milliseconds; the two parse counters are plain counts:

| Key | Measures |
|---|---|
| `walk_ms` | `os.scandir` directory traversal |
| `parse_count` | total files passed to `_parse_cached` |
| `parse_cached_hits` | how many were served from the AST cache |
| `parse_total_ms` | cumulative `_parse_cached` time across all files |
| `compose_all_ms` | `nx.compose_all()` graph merge |
| `scc_ms` | `nx.strongly_connected_components` (cycle detection) |
| `export_build_ms` | edge loop + `output_data` dict construction |
| `analyze_total_ms` | wall time for entire `analyze_file()` call |
| `export_total_ms` | wall time for entire `export_to_react_flow()` call |
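
As a rough illustration of how the two parse counters combine with `parse_total_ms`, the sketch below derives a cache hit rate and a mean parse time per file from a `_profile` dict. The helper name `parse_summary` and the input values in the usage lines are hypothetical; only the key names come from the table above.

```python
def parse_summary(profile):
    """Derive cache hit rate and mean parse time from a _profile dict.

    Hypothetical helper: only the key names are taken from the PR.
    """
    count = profile.get("parse_count", 0)
    hits = profile.get("parse_cached_hits", 0)
    total_ms = profile.get("parse_total_ms", 0.0)
    if count == 0:
        # Nothing was parsed, so both ratios are reported as zero.
        return {"hit_rate": 0.0, "mean_parse_ms": 0.0}
    return {"hit_rate": hits / count, "mean_parse_ms": total_ms / count}


# Made-up numbers, for illustration only.
example = {"parse_count": 10, "parse_cached_hits": 4, "parse_total_ms": 25.0}
summary = parse_summary(example)
```

A low hit rate together with a high mean parse time would point at cold parsing, which is the case this PR sets out to measure.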

Instrumented files

analyst/analyzer.py — thread-local _PROFILE_DATA accumulator in _parse_cached; timing added to _analyze_directory (walk, parse pass, compose_all); analyze_file gains an optional _profile: dict | None = None kwarg.
analyst/exporter.py — _profile kwarg on export_to_react_flow; timers around the SCC and dict-build blocks.
app/routers/upload.py — reads ?profile=1, creates the collector dict, wraps both calls, returns JSONResponse({…, "_profile": profile_data}) only when profiling is enabled.

No new dependencies. Normal requests pay only for the `if _profile is not None` guards; no timer call runs unless profiling was requested.
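
The guard pattern described above can be sketched as follows. This is a minimal stand-alone illustration, not the PR's code: `_local`, `start_profiling`, `stop_profiling`, and `parse_source` are hypothetical names, and the real `_PROFILE_DATA` accumulator may be shaped differently.

```python
import ast
import threading
import time

# Hypothetical thread-local store; each thread sees its own `data` attribute.
_local = threading.local()


def start_profiling():
    """Enable collection for the current thread only."""
    _local.data = {"parse_count": 0, "parse_total_ms": 0.0}


def stop_profiling():
    """Disable collection and return what was gathered (None if never enabled)."""
    data = getattr(_local, "data", None)
    _local.data = None
    if data is not None:
        data["parse_total_ms"] = round(data["parse_total_ms"], 2)
    return data


def parse_source(source):
    """Parse Python source, timing the call only when profiling is on."""
    data = getattr(_local, "data", None)
    # The guard keeps perf_counter() off the normal path.
    t0 = time.perf_counter() if data is not None else 0.0
    tree = ast.parse(source)
    if data is not None:
        data["parse_count"] += 1
        data["parse_total_ms"] += (time.perf_counter() - t0) * 1000
    return tree
```

With profiling off, `parse_source` does one attribute lookup and two `None` checks beyond the parse itself, which is the sense in which the normal path stays cheap.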

Sample profile — VibeGraph itself (cold cache, 68 Python files, 1174 nodes / 2646 edges)

{
  "walk_ms": 2.33,
  "parse_count": 68,
  "parse_cached_hits": 2,
  "parse_total_ms": 85.48,
  "compose_all_ms": 6.56,
  "scc_ms": 2.54,
  "export_build_ms": 2.71,
  "analyze_total_ms": 102.1,
  "export_total_ms": 5.9
}

Reading: parse_total_ms (85.48 ms) is about 84 % of analyze_total_ms (102.1 ms) on a cold cache. walk, compose_all, and the export steps are each under 7 ms and not worth optimising yet. On this sample, any further latency work should therefore focus on reducing cold-parse time (e.g. a persistent on-disk AST cache or parallel parsing), not on the graph or export stages.
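
The arithmetic behind that reading can be reproduced from the sample above. The sketch assumes the walk, parse, and compose stages run sequentially inside `analyze_file()`, so their sum can be compared against `analyze_total_ms`; `stage_shares` is a hypothetical helper name.

```python
# Values copied from the VibeGraph cold-cache sample profile.
SAMPLE = {
    "walk_ms": 2.33,
    "parse_total_ms": 85.48,
    "compose_all_ms": 6.56,
    "analyze_total_ms": 102.1,
}


def stage_shares(profile, total_key="analyze_total_ms",
                 stage_keys=("walk_ms", "parse_total_ms", "compose_all_ms")):
    """Return each stage's percentage of the total and the unattributed time."""
    total = profile[total_key]
    shares = {key: round(100 * profile[key] / total, 1) for key in stage_keys}
    # Time inside the total that none of the listed stage timers covers.
    unaccounted_ms = round(total - sum(profile[key] for key in stage_keys), 2)
    return shares, unaccounted_ms


shares, unaccounted_ms = stage_shares(SAMPLE)
```

On this sample the parse share comes out at roughly 83.7 %, and about 7.7 ms of `analyze_total_ms` is not attributed to any of the three stage timers.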

Sample profile — synthetic 60-function file (cold cache)

{
  "walk_ms": 0.08,
  "parse_count": 1,
  "parse_cached_hits": 0,
  "parse_total_ms": 1.57,
  "compose_all_ms": 0.41,
  "scc_ms": 0.34,
  "export_build_ms": 0.06
}

Checklist

GROQ_API_KEY=dummy python3 -m pytest tests/ -q → 251 passed, 0 failed
ruff check analyst/analyzer.py analyst/exporter.py app/routers/upload.py → All checks passed
No changes to requirements.txt
No overhead on normal (non-?profile=1) requests

Generated by Claude Code

Adds ?profile=1 opt-in timing instrumentation so cold-cache latency on medium repos can be measured without adding any logging overhead to normal requests. All timers are guarded by the query param and do not fire unless a request opts in. https://claude.ai/code/session_014rqwarDVK9fwzxWV2pzS2z

vercel · 2026-05-11T09:11:44Z


Project	Deployment	Actions	Updated (UTC)
vibegraph	Ready	Preview, Comment	May 12, 2026 7:49pm

Copilot

Pull request overview

Adds opt-in profiling instrumentation to the /api/upload-project pipeline so users (or support/debug sessions) can measure where time is spent on cold-cache uploads without impacting normal requests.

Changes:

Add ?profile=1 support in /api/upload-project, returning an additional _profile dict in the response.
Instrument analyzer stages (walk, parse pass, nx.compose_all) and accumulate parse timing/counts via a thread-local collector.
Instrument exporter stages (SCC/cycle detection and edge/output dict build) and plumb _profile through.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| `app/routers/upload.py` | Adds `?profile=1` flag handling, end-to-end timers for analyze/export, and conditionally returns `_profile`. |
| `analyst/analyzer.py` | Adds thread-local profiling accumulator and directory-analysis timers; plumbs optional `_profile` kwarg. |
| `analyst/exporter.py` | Adds optional `_profile` kwarg and timings for SCC and export build steps. |

Comments suppressed due to low confidence (1)

analyst/exporter.py:60

export_to_react_flow(..., _profile=...) adds new profiling outputs (scc_ms, export_build_ms) but there are no tests validating these keys are written when profiling is enabled. Since tests/test_export.py already covers this exporter, consider adding a small test passing a dict for _profile and asserting the expected keys are present and numeric.
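
A test along the lines suggested here might look like the sketch below. Because the full signature of `export_to_react_flow` is not shown in this thread, the sketch exercises a hypothetical stand-in, `fake_export`, that follows the same `_profile` convention; a real test in tests/test_export.py would call the actual exporter with the project's own fixtures.

```python
import numbers
import time


def assert_profile_keys(profile, expected=("scc_ms", "export_build_ms")):
    """Check that each expected key is present and holds a non-negative number."""
    for key in expected:
        assert key in profile, f"missing profile key: {key}"
        value = profile[key]
        is_number = isinstance(value, numbers.Real) and not isinstance(value, bool)
        assert is_number, f"{key} is not numeric: {value!r}"
        assert value >= 0, f"{key} is negative: {value!r}"


def fake_export(edges, _profile=None):
    """Hypothetical stand-in for the exporter; writes timings only when asked."""
    t0 = time.perf_counter() if _profile is not None else 0.0
    output = {"edges": [{"source": s, "target": t} for s, t in edges]}
    if _profile is not None:
        _profile["scc_ms"] = 0.0  # this stand-in has no SCC step to time
        _profile["export_build_ms"] = round((time.perf_counter() - t0) * 1000, 2)
    return output


def test_profile_keys_written():
    profile = {}
    fake_export([("a", "b")], _profile=profile)
    assert_profile_keys(profile)


def test_output_unchanged_without_profile():
    assert fake_export([("a", "b")]) == {"edges": [{"source": "a", "target": "b"}]}
```

Excluding `bool` in the numeric check is deliberate, since `True` would otherwise pass as a number.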

        # Detect cycles
        # Optimization: Use strongly_connected_components O(V+E) instead of simple_cycles O((V+E)C)
        # An edge is part of a cycle if both endpoints belong to the same SCC of size > 1.
        cycle_edges = set()
        _t_scc = time.perf_counter() if _profile is not None else 0.0
        try:
            node_to_component = {}
            for i, component in enumerate(nx.strongly_connected_components(graph)):
                if len(component) > 1:
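
The rule in the comment above (an edge is treated as a cycle edge when both endpoints share an SCC of size greater than one) can be illustrated without networkx. The sketch below swaps in a small recursive Tarjan implementation as a stand-in for `nx.strongly_connected_components`; the function names and the adjacency-dict input are assumptions made for illustration, not the exporter's code.

```python
def strongly_connected_components(adj):
    """Tarjan's algorithm; adj maps each node to an iterable of successors.

    Recursive, so only suitable for small illustrative graphs.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in adj.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            components.append(component)

    for v in list(adj):
        if v not in index:
            visit(v)
    return components


def cycle_edges(adj):
    """Edges whose endpoints share an SCC with more than one node."""
    node_to_component = {}
    for i, component in enumerate(strongly_connected_components(adj)):
        if len(component) > 1:
            for node in component:
                node_to_component[node] = i
    return {
        (u, v)
        for u, successors in adj.items()
        for v in successors
        if u in node_to_component
        and node_to_component.get(v) == node_to_component[u]
    }


graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
found = cycle_edges(graph)
```

Under this rule a self-loop on an otherwise isolated node is not reported, because its component has size one.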


    tree = ast.parse(source, filename=filename)
    with _AST_CACHE_LOCK:
        _AST_CACHE[key] = tree
        _AST_CACHE.move_to_end(key)
        if len(_AST_CACHE) > _AST_CACHE_MAX:


+        if profile_data is not None:
+            return JSONResponse(content={**response_data, "_profile": profile_data})
        return response_data


+        profile_mode = request.query_params.get("profile") == "1"
+        profile_data: dict | None = {} if profile_mode else None
+        _t0 = time.perf_counter() if profile_data is not None else 0.0
+        result = CodeAnalyzer().analyze_file(tmp_dir, _profile=profile_data)
+        if profile_data is not None:
+            profile_data["analyze_total_ms"] = round((time.perf_counter() - _t0) * 1000, 2)


@@ -445,7 +462,7 @@ def analyze_file(self, target_path: str) -> dict[str, Any]:
            return {"error": f"Path not found: {safe_path}"}

        if os.path.isdir(target_path):
-            return self._analyze_directory(target_path)
+            return self._analyze_directory(target_path, _profile=_profile)
        else:



Copilot AI review requested due to automatic review settings May 11, 2026 09:11

Copilot started reviewing on behalf of madara88645 May 11, 2026 09:12

Copilot AI reviewed May 11, 2026


madara88645 and others added 2 commits May 12, 2026 20:46

Merge origin/main into phase1/profiling-timers-upload (resolve for PR…

f678fe9

… 318)

style: apply ruff format to profiling upload path (PR 318)

76717a2

Co-authored-by: Cursor <cursoragent@cursor.com>

vercel Bot deployed to Preview May 12, 2026 19:49

madara88645 merged commit 371072a into main May 12, 2026
8 checks passed

madara88645 deleted the phase1/profiling-timers-upload branch May 12, 2026 19:50

madara88645 merged 3 commits into main from phase1/profiling-timers-upload
