Commit 3c465a5

Merge pull request #159 from rostilos/1.5.3-rc
feat: Enhance branch issue reconciliation and file snapshot retrieval…
2 parents cac38cb + b44391f commit 3c465a5

3 files changed

Lines changed: 147 additions & 19 deletions

README.md

Lines changed: 97 additions & 12 deletions
@@ -4,25 +4,110 @@
 
 ## Capabilities by Platform
 
-CodeCrow supports multiple version control systems with varying levels of integration. Below is the current feature matrix:
-
-| Feature | Bitbucket | GitHub | GitLab |
-| :--------------------- | :-------: | :----: | :----: |
-| PR Analysis | + | + | + |
-| Branch Analysis | + | + | + |
-| Task Context Retrieval | - | - | - |
-| /ask | + | + | + |
-| /analyze | + | + | + |
-| /summarize | + | + | + |
-| Continuous Analysis | + | + | + |
-| RAG Pipeline | + | + | + |
+CodeCrow supports multiple version control systems. The AI analysis engine is the same across all platforms — the differences are in how results are surfaced in each VCS.
+
+### Analysis & Review
+
+| Feature | Bitbucket | GitHub | GitLab |
+| :----------------------- | :-------: | :----: | :----: |
+| PR / MR Analysis ||||
+| Branch Analysis (push) ||||
+| Continuous Analysis ||||
+| Incremental / Delta Diff ||||
+| RAG-Augmented Review ||||
+
+### PR / MR Comment Integration
+
+| Feature | Bitbucket | GitHub | GitLab |
+| :--------------------------------- | :---------------: | :----: | :----: |
+| PR Summary Comment ||||
+| Inline Diff Comments | via Code Insights |||
+| Code Insights Report + Annotations ||||
+| Check Runs ||||
+| Threaded Comment Replies ||||
+| Placeholder While Analyzing ||||
+
+### Slash Commands (in PR comments)
+
+| Command | Bitbucket | GitHub | GitLab |
+| :---------------- | :-------: | :----: | :----: |
+| `/ask <question>` ||||
+| `/analyze` ||||
+| `/summarize` ||||
+
+### Dashboard & Issue Management
+
+These features are platform-independent and available through the CodeCrow web UI.
+
+| Feature | Description |
+| :-------------------------- | :----------------------------------------------------------------------------- |
+| Issue Tracker | Per-branch and per-PR issue lists with severity, category, and status filters |
+| Issue Lifecycle | Automatic resolution tracking across analyses; manual resolve/reopen |
+| Source Context Viewer | Full source code browser with inline issue annotations for every analyzed file |
+| Git Graph | Visual commit history with per-commit analysis status and branch health |
+| Quality Gates | Configurable pass/fail thresholds per workspace |
+| Custom Rules | Per-project enforce/suppress rules with glob-based file patterns |
+| Project Analytics | Aggregated severity breakdown, analysis history, and branch health |
+| AI Model Selection | Choose your LLM provider and model (OpenRouter, Anthropic, Google, OpenAI) |
+| Workspace & Team Management | Roles (Owner, Admin, Member, Viewer), member invites, ownership transfer |
+| Two-Factor Authentication | TOTP-based 2FA for sensitive operations |
+
+### Setup Methods
+
+| Method | Bitbucket | GitHub | GitLab |
+| :----------------- | :----------: | :-------------: | :----: |
+| Native App Install | ✅ (Connect) | ✅ (GitHub App) ||
+| Manual Webhook ||||
+| CI Pipeline Action ||||
+
+---
+
+## Supported Languages
+
+CodeCrow's AI review is **language-agnostic** — it analyzes any language or framework the underlying LLM can understand. No special configuration is required.
+
+The RAG pipeline (codebase indexing for context-aware reviews) provides enhanced support for languages with dedicated AST parsers. All other text-based files are indexed using a generic chunker.
+
+| Language | AI Review | RAG (AST) | Notes |
+| :----------------------- | :-------: | :-------: | :------------------------------------------------ |
+| Java ||| incl. Spring, Jakarta EE, Android |
+| Kotlin ||| incl. Android, Ktor |
+| Python ||| incl. Django, Flask, FastAPI |
+| JavaScript ||| incl. React, Vue, Svelte, Node.js |
+| TypeScript ||| incl. Angular, Next.js, Deno |
+| Go ||| |
+| Rust ||| |
+| C ||| |
+| C++ ||| |
+| C# ||| incl. .NET, ASP.NET, Unity |
+| PHP ||| incl. Laravel, Symfony |
+| Ruby ||| incl. Rails |
+| Swift ||| incl. iOS / macOS |
+| Scala ||| |
+| Lua ||| |
+| Perl ||| |
+| Haskell ||| |
+| COBOL ||| |
+| Objective-C ||| |
+| Bash / Shell ||| |
+| SQL ||| |
+| R ||| |
+| HTML / CSS / SCSS ||| |
+| Vue / Svelte SFCs ||| |
+| YAML / TOML / JSON / XML ||| config files, IaC |
+| Markdown / RST ||| documentation |
+| _Any other language_ || generic | LLM-dependent; no AST, uses text chunking for RAG |
+
+> **Framework-specific?** The review quality scales with the LLM's knowledge of the framework. Popular frameworks (React, Spring Boot, Django, Rails, Laravel, .NET, etc.) get high-quality, idiomatic feedback out of the box. Niche frameworks work too — the LLM simply has less training data to draw on.
 
 ## Key Features
 
 - **Context-Aware Reviews**: Powered by a custom RAG (Retrieval-Augmented Generation) pipeline using Qdrant vector storage.
 - **Incremental Analysis**: Only scans changed code to keep feedback fast and cost-efficient.
 - **Multi-Tenant Architecture**: Securely manage multiple teams and projects from a single dashboard.
 - **Interactive Commands**: Command CodeCrow directly from PR comments using `/ask`, `/analyze`, and `/summarize`.
+- **Issue Lifecycle**: Automatic tracking of resolved vs. open issues across analyses with deterministic and AI-based reconciliation.
+- **Bring Your Own Model**: Connect your preferred LLM provider — OpenRouter, Anthropic, Google, or OpenAI.
 
 ## Documentation
 
java-ecosystem/libs/analysis-engine/src/main/java/org/rostilos/codecrow/analysisengine/service/branch/BranchIssueReconciliationService.java

Lines changed: 24 additions & 0 deletions
@@ -279,6 +279,11 @@ public int sweepDeterministicResolutions(
 
         int resolvedCount = 0;
 
+        // Collect file contents fetched during sweep for branch-level snapshot backfill.
+        // This progressively fills the Source Context tab with all files that have issues,
+        // even if those files were never in a diff scope.
+        Map<String, String> fetchedFileContents = new LinkedHashMap<>();
+
         for (Map.Entry<String, List<BranchIssue>> entry : issuesByFile.entrySet()) {
             String filePath = entry.getKey();
             List<BranchIssue> fileIssues = entry.getValue();
@@ -303,6 +308,7 @@ public int sweepDeterministicResolutions(
                     continue;
                 }
                 currentHashes = LineHashSequence.from(fileContent);
+                fetchedFileContents.put(filePath, fileContent);
             } catch (Exception e) {
                 log.debug("Sweep: skipping file {} (fetch failed: {})", filePath, e.getMessage());
                 continue; // Don't resolve on error — leave for next run
@@ -357,6 +363,24 @@ public int sweepDeterministicResolutions(
             }
         }
 
+        // ── Backfill branch-level snapshots for non-diff files ────────────
+        // The sweep already fetched content for these files; persisting them
+        // as branch-level snapshots ensures they appear in the Source Context
+        // tab alongside files from the normal diff-based analysis scope.
+        if (!fetchedFileContents.isEmpty()) {
+            try {
+                int backfilled = fileSnapshotService.persistSnapshotsForBranch(
+                        branch, fetchedFileContents, request.getCommitHash());
+                if (backfilled > 0) {
+                    log.info("Backfilled {} branch-level file snapshots from sweep-fetched content (Branch: {})",
+                            backfilled, request.getTargetBranchName());
+                }
+            } catch (Exception e) {
+                log.warn("Failed to backfill branch snapshots from sweep (non-critical): {}",
+                        e.getMessage());
+            }
+        }
+
         if (resolvedCount > 0) {
             log.info("Deterministic sweep resolved {} stale issues across {} non-diff files (Branch: {})",
                     resolvedCount, issuesByFile.size(), request.getTargetBranchName());
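
The collect-then-backfill pattern in this change can be illustrated in isolation. The following is a minimal sketch, not the project's code: `SweepBackfillSketch`, `sweepAndBackfill`, and the `fetcher`/`persister` parameters are hypothetical stand-ins for the VCS fetch and for `fileSnapshotService.persistSnapshotsForBranch`, and skipping `null` content is an assumption about the part of the loop the diff does not show. It shows the two properties the diff relies on: a failed fetch skips only that file, and a failed persist never fails the sweep.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToIntFunction;

public class SweepBackfillSketch {

    /**
     * Fetches each file, remembers the content that was fetched successfully,
     * then hands the collected map to a best-effort persister.
     * Returns the number of snapshots the persister reports, or 0 on failure.
     */
    static int sweepAndBackfill(List<String> filePaths,
                                Function<String, String> fetcher,      // hypothetical: VCS file fetch
                                ToIntFunction<Map<String, String>> persister) { // hypothetical: snapshot store
        // LinkedHashMap keeps the files in the order they were swept.
        Map<String, String> fetched = new LinkedHashMap<>();

        for (String path : filePaths) {
            try {
                String content = fetcher.apply(path);
                if (content == null) {
                    continue; // assumption: nothing to record for this file
                }
                fetched.put(path, content);
            } catch (RuntimeException e) {
                // A fetch failure skips this file only; the sweep continues.
                continue;
            }
        }

        if (fetched.isEmpty()) {
            return 0;
        }
        try {
            return persister.applyAsInt(fetched);
        } catch (RuntimeException e) {
            // Backfill is non-critical: swallow the failure and report nothing persisted.
            return 0;
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        int persisted = sweepAndBackfill(
                List.of("A.java", "B.java"),
                path -> "content of " + path,
                contents -> { store.putAll(contents); return contents.size(); });
        System.out.println(persisted + " snapshots persisted: " + store.keySet());
    }
}
```

Keeping the persist step outside the per-file loop means the store is called once per sweep, and wrapping it in its own `try` keeps the resolution count unaffected by storage errors.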

java-ecosystem/libs/file-content/src/main/java/org/rostilos/codecrow/filecontent/service/FileSnapshotService.java

Lines changed: 26 additions & 7 deletions
@@ -16,6 +16,8 @@
 import java.nio.charset.StandardCharsets;
 import java.security.MessageDigest;
 import java.security.NoSuchAlgorithmException;
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Optional;
@@ -468,21 +470,38 @@ public List<AnalyzedFileSnapshot> getSnapshotsForPr(Long pullRequestId) {
     // ── Branch-level aggregated retrieval ────────────────────────────────
 
     /**
-     * Get the latest file snapshots for a branch. Tries the direct branch_id FK first;
-     * falls back to the legacy DISTINCT ON aggregation across analyses.
+     * Get the latest file snapshots for a branch.
+     * <p>
+     * Merges two snapshot sources to ensure ALL ever-analysed files are visible:
+     * <ol>
+     * <li><b>Branch-level snapshots</b> (direct branch_id FK) — created by
+     * {@link #persistSnapshotsForBranch} during each analysis run. These only
+     * cover files that appeared in a diff scope.</li>
+     * <li><b>Legacy analysis-level snapshots</b> (via analysis_id + DISTINCT ON) — cover
+     * all files from prior analyses that used the older code path.</li>
+     * </ol>
+     * Branch-level snapshots take precedence when both exist for the same file path.
      * Returns metadata only (no content loaded).
      */
     public List<AnalyzedFileSnapshot> getSnapshotsForBranch(Long projectId, String branchName) {
-        // Try direct FK first
+        Map<String, AnalyzedFileSnapshot> snapshotsByPath = new LinkedHashMap<>();
+
+        // 1. Branch-level snapshots (highest priority — latest content)
         Optional<Branch> branchOpt = branchRepository.findByProjectIdAndBranchName(projectId, branchName);
         if (branchOpt.isPresent()) {
             List<AnalyzedFileSnapshot> direct = snapshotRepository.findByBranchId(branchOpt.get().getId());
-            if (!direct.isEmpty()) {
-                return direct;
+            for (AnalyzedFileSnapshot s : direct) {
+                snapshotsByPath.put(s.getFilePath(), s);
             }
         }
-        // Legacy fallback
-        return snapshotRepository.findLatestSnapshotsByBranch(projectId, branchName);
+
+        // 2. Legacy analysis-level snapshots (fill gaps for files not yet in branch FK)
+        List<AnalyzedFileSnapshot> legacy = snapshotRepository.findLatestSnapshotsByBranch(projectId, branchName);
+        for (AnalyzedFileSnapshot s : legacy) {
+            snapshotsByPath.putIfAbsent(s.getFilePath(), s);
+        }
+
+        return new ArrayList<>(snapshotsByPath.values());
     }
 
     /**
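
The precedence rule in the new `getSnapshotsForBranch` comes from the pairing of `Map.put` for the branch-level source and `Map.putIfAbsent` for the legacy source. The sketch below reproduces that merge without the repositories. `SnapshotMergeSketch`, `mergeByPath`, and the `Snapshot` record are hypothetical stand-ins for illustration only, with `Snapshot` replacing `AnalyzedFileSnapshot`; a Java 16+ compiler is assumed for the record syntax.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SnapshotMergeSketch {

    // Hypothetical stand-in for AnalyzedFileSnapshot; "source" only labels where it came from.
    record Snapshot(String filePath, String source) {}

    /**
     * Merges two snapshot lists keyed by file path.
     * Branch-level entries win; legacy entries only fill paths that are still missing.
     */
    static List<Snapshot> mergeByPath(List<Snapshot> branchLevel, List<Snapshot> legacy) {
        // LinkedHashMap preserves insertion order: branch-level paths first, then legacy gaps.
        Map<String, Snapshot> byPath = new LinkedHashMap<>();

        for (Snapshot s : branchLevel) {
            byPath.put(s.filePath(), s);          // overwrites any earlier entry for the path
        }
        for (Snapshot s : legacy) {
            byPath.putIfAbsent(s.filePath(), s);  // never replaces a branch-level entry
        }
        return new ArrayList<>(byPath.values());
    }

    public static void main(String[] args) {
        List<Snapshot> merged = mergeByPath(
                List.of(new Snapshot("src/A.java", "branch")),
                List.of(new Snapshot("src/A.java", "legacy"),
                        new Snapshot("src/B.java", "legacy")));
        merged.forEach(System.out::println);
    }
}
```

Compared with the previous early return, this merge no longer hides legacy snapshots as soon as a single branch-level snapshot exists, at the cost of always running the legacy query.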
