Phase 1: profiling timers for /api/upload-project #318

Merged
madara88645 merged 3 commits into main from phase1/profiling-timers-upload on May 12, 2026

@madara88645
Owner

Context

Two weeks ago commit d74265b ("Perf: content-hash AST cache for /api/upload-project #250") landed. The follow-up question was whether first-upload (cold-cache) latency on medium repos is still dominant.

Investigation result (2026-05-11): Zero open issues, zero performance complaints on GitHub since the cache merged. Production telemetry was not reachable from this run (no log access from a fresh session, as expected). Shipping the timers preemptively — they are gated behind ?profile=1 so they are completely free in normal operation. The data they collect will answer the latency question on the next user upload.


What this adds

?profile=1 on POST /api/upload-project returns a _profile dict alongside the normal response, with time.perf_counter()-based timings for every significant work unit:

| Key | Measures |
|---|---|
| walk_ms | os.scandir directory traversal |
| parse_count | total files passed to _parse_cached |
| parse_cached_hits | how many were served from the AST cache |
| parse_total_ms | cumulative _parse_cached time across all files |
| compose_all_ms | nx.compose_all() graph merge |
| scc_ms | nx.strongly_connected_components (cycle detection) |
| export_build_ms | edge loop + output_data dict construction |
| analyze_total_ms | wall time for entire analyze_file() call |
| export_total_ms | wall time for entire export_to_react_flow() call |

Instrumented files

  • analyst/analyzer.py: thread-local _PROFILE_DATA accumulator in _parse_cached; timing added to _analyze_directory (walk, parse pass, compose_all); analyze_file gains an optional _profile: dict | None = None kwarg.
  • analyst/exporter.py: _profile kwarg on export_to_react_flow; timers around the SCC and dict-build blocks.
  • app/routers/upload.py: reads ?profile=1, creates the collector dict, wraps both calls, returns JSONResponse({…, "_profile": profile_data}) only when profiling is enabled.
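
A thread-local accumulator of the kind described for analyst/analyzer.py might look roughly like the sketch below. It is an illustration only: `start_profile`, `stop_profile`, `parse_cached_sketch`, and the plain-dict cache are hypothetical stand-ins, not the PR's code. Only the three reported key names come from the description above.

```python
import ast
import hashlib
import threading
import time

# Hypothetical stand-ins: the real module's names and cache differ.
_LOCAL = threading.local()
_CACHE: dict[str, ast.AST] = {}
_CACHE_LOCK = threading.Lock()


def start_profile() -> None:
    """Enable collection for the calling thread only."""
    _LOCAL.data = {"parse_count": 0, "parse_cached_hits": 0, "parse_total_ms": 0.0}


def stop_profile() -> dict:
    """Return the collected counters and switch collection off again."""
    data = getattr(_LOCAL, "data", None) or {}
    _LOCAL.data = None
    return data


def parse_cached_sketch(source: str, filename: str = "<unknown>") -> ast.AST:
    data = getattr(_LOCAL, "data", None)  # None means profiling is off
    t0 = time.perf_counter() if data is not None else 0.0

    # Content-hash key: identical source text maps to the same cache entry.
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    with _CACHE_LOCK:
        tree = _CACHE.get(key)
    hit = tree is not None
    if tree is None:
        tree = ast.parse(source, filename=filename)
        with _CACHE_LOCK:
            _CACHE[key] = tree

    if data is not None:
        data["parse_count"] += 1
        data["parse_cached_hits"] += int(hit)
        data["parse_total_ms"] += (time.perf_counter() - t0) * 1000
    return tree


start_profile()
parse_cached_sketch("x = 1")
parse_cached_sketch("x = 1")  # same content, so this call is a cache hit
profile = stop_profile()
```

Keeping the counters in `threading.local()` means concurrent requests on other threads never see or mutate each other's numbers, at the cost of one attribute lookup per call.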

No new dependencies. Normal requests make no timer calls: every instrumented block is guarded by if _profile is not None, so the only added cost is that check.

Sample profile — VibeGraph itself (cold cache, 68 Python files, 1174 nodes / 2646 edges)

{
  "walk_ms": 2.33,
  "parse_count": 68,
  "parse_cached_hits": 2,
  "parse_total_ms": 85.48,
  "compose_all_ms": 6.56,
  "scc_ms": 2.54,
  "export_build_ms": 2.71,
  "analyze_total_ms": 102.1,
  "export_total_ms": 5.9
}

Reading: parse_total_ms (85 ms) is ~83 % of the analyze_total_ms budget on a cold cache. walk, compose_all, and the export steps are all sub-7 ms and not worth optimising yet. This confirms that any further latency work should focus on reducing cold-parse time (e.g. persistent on-disk AST cache or parallel parsing), not on the graph or export stages.
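
The shares quoted in this reading can be recomputed from the sample profile with a few lines of arithmetic. The sketch below uses only the numbers from the JSON sample. The name `unattributed_ms` is introduced here for illustration, and treating walk, parse, and compose_all as the only timed parts of analyze_total_ms is an assumption.

```python
profile = {
    "walk_ms": 2.33,
    "parse_total_ms": 85.48,
    "compose_all_ms": 6.56,
    "scc_ms": 2.54,
    "export_build_ms": 2.71,
    "analyze_total_ms": 102.1,
    "export_total_ms": 5.9,
}

analyze_parts = ("walk_ms", "parse_total_ms", "compose_all_ms")
analyze_total = profile["analyze_total_ms"]

# Share of analyze_total_ms spent in each timed analyzer stage, in percent.
shares = {k: round(100 * profile[k] / analyze_total, 1) for k in analyze_parts}

# Time inside analyze_file() that none of the three stage timers cover.
unattributed_ms = round(analyze_total - sum(profile[k] for k in analyze_parts), 2)

for key, pct in shares.items():
    print(f"{key:>16}: {pct:5.1f} %")
print(f"{'unattributed':>16}: {unattributed_ms} ms")
```

On these numbers the parse share comes out at about 83.7 %, and roughly 7.7 ms of analyze_total_ms is not covered by any of the three stage timers.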

Sample profile — synthetic 60-function file (cold cache)

{
  "walk_ms": 0.08,
  "parse_count": 1,
  "parse_cached_hits": 0,
  "parse_total_ms": 1.57,
  "compose_all_ms": 0.41,
  "scc_ms": 0.34,
  "export_build_ms": 0.06
}

Checklist

  • GROQ_API_KEY=dummy python3 -m pytest tests/ -q251 passed, 0 failed
  • ruff check analyst/analyzer.py analyst/exporter.py app/routers/upload.pyAll checks passed
  • No changes to requirements.txt
  • No overhead on normal (non-?profile=1) requests

Generated by Claude Code

Adds ?profile=1 opt-in timing instrumentation so cold-cache latency on
medium repos can be measured without adding any logging overhead to
normal requests. All timers are guarded by the query param and never
fire in production traffic.

https://claude.ai/code/session_014rqwarDVK9fwzxWV2pzS2z
Copilot AI review requested due to automatic review settings May 11, 2026 09:11
@vercel

vercel Bot commented May 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| vibegraph | Ready | Preview, Comment | May 12, 2026 7:49pm |

Contributor

Copilot AI left a comment

Pull request overview

Adds opt-in profiling instrumentation to the /api/upload-project pipeline so users (or support/debug sessions) can measure where time is spent on cold-cache uploads without impacting normal requests.

Changes:

  • Add ?profile=1 support in /api/upload-project, returning an additional _profile dict in the response.
  • Instrument analyzer stages (walk, parse pass, nx.compose_all) and accumulate parse timing/counts via a thread-local collector.
  • Instrument exporter stages (SCC/cycle detection and edge/output dict build) and plumb _profile through.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
|---|---|
| app/routers/upload.py | Adds ?profile=1 flag handling, end-to-end timers for analyze/export, and conditionally returns _profile. |
| analyst/analyzer.py | Adds thread-local profiling accumulator and directory-analysis timers; plumbs optional _profile kwarg. |
| analyst/exporter.py | Adds optional _profile kwarg and timings for SCC and export build steps. |
Comments suppressed due to low confidence (1)

analyst/exporter.py:60

  • export_to_react_flow(..., _profile=...) adds new profiling outputs (scc_ms, export_build_ms) but there are no tests validating these keys are written when profiling is enabled. Since tests/test_export.py already covers this exporter, consider adding a small test passing a dict for _profile and asserting the expected keys are present and numeric.
        # Detect cycles
        # Optimization: Use strongly_connected_components O(V+E) instead of simple_cycles O((V+E)C)
        # An edge is part of a cycle if both endpoints belong to the same SCC of size > 1.
        cycle_edges = set()
        _t_scc = time.perf_counter() if _profile is not None else 0.0
        try:
            node_to_component = {}
            for i, component in enumerate(nx.strongly_connected_components(graph)):
                if len(component) > 1:
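
The rule stated in the code comment above (an edge lies on a cycle if both endpoints share an SCC of size > 1) can be demonstrated without any dependencies. The exporter itself calls `nx.strongly_connected_components`; the sketch below swaps in a hand-rolled Kosaraju pass so that it runs on the standard library alone. The names `sccs_kosaraju` and `cycle_edges` are hypothetical.

```python
from collections import defaultdict


def sccs_kosaraju(nodes, edges):
    """Return the strongly connected components as a list of sets."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(fwd[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:  # all successors explored: node is finished
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse completion order.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, todo = {root}, [root]
        while todo:
            node = todo.pop()
            for prev in rev[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    todo.append(prev)
        components.append(component)
    return components


def cycle_edges(edges):
    """Edges whose endpoints share an SCC containing more than one node."""
    nodes = list(dict.fromkeys(n for edge in edges for n in edge))
    component_of = {}
    for i, component in enumerate(sccs_kosaraju(nodes, edges)):
        if len(component) > 1:
            for n in component:
                component_of[n] = i
    return {(u, v) for u, v in edges
            if u in component_of and component_of[u] == component_of.get(v)}


graph_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
on_cycle = cycle_edges(graph_edges)
```

Here `("c", "d")` is left out because `d` sits in its own single-node component. Under this size > 1 criterion a self-loop is also not reported, since its node forms a component of size one.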

Comment thread analyst/analyzer.py
Comment on lines 156 to 160
tree = ast.parse(source, filename=filename)
with _AST_CACHE_LOCK:
    _AST_CACHE[key] = tree
    _AST_CACHE.move_to_end(key)
    if len(_AST_CACHE) > _AST_CACHE_MAX:
Comment thread app/routers/upload.py
Comment on lines +276 to 278
if profile_data is not None:
    return JSONResponse(content={**response_data, "_profile": profile_data})
return response_data
Comment thread app/routers/upload.py Outdated
Comment on lines +226 to +231
profile_mode = request.query_params.get("profile") == "1"
profile_data: dict | None = {} if profile_mode else None
_t0 = time.perf_counter() if profile_data is not None else 0.0
result = CodeAnalyzer().analyze_file(tmp_dir, _profile=profile_data)
if profile_data is not None:
    profile_data["analyze_total_ms"] = round((time.perf_counter() - _t0) * 1000, 2)
Comment thread analyst/analyzer.py Outdated
Comment on lines 451 to 466
@@ -445,7 +462,7 @@ def analyze_file(self, target_path: str) -> dict[str, Any]:
             return {"error": f"Path not found: {safe_path}"}
 
         if os.path.isdir(target_path):
-            return self._analyze_directory(target_path)
+            return self._analyze_directory(target_path, _profile=_profile)
         else:
@madara88645 madara88645 merged commit 371072a into main May 12, 2026
8 checks passed
@madara88645 madara88645 deleted the phase1/profiling-timers-upload branch May 12, 2026 19:50