Commit f9762d8

feat: Mermaid PR architecture-diff action (level-1 default, nested opt-in)
Replaces the webview/Playwright PNG approach with an inline Mermaid diagram that GitHub renders natively in the PR comment: no image, no orphan branch, no contents:write, and fork-friendly.

How it works:
- Resolve a base ("before") analysis: use the committed .codeboarding/analysis.json at the PR base if present, else generate one via a full engine run on the base commit.
- Analyze the PR head incrementally, seeded from the base (stable component ids), falling back to a full run on cache miss.
- scripts/diff_to_mermaid.py diffs the two analyses (name-based matching; relation label change => modified) and emits a graph LR with nodes colored via classDef/class and arrows via positional linkStyle: green added, yellow modified, red dashed deleted. Escaping, deleted-namespace keying, and a size guard (GitHub's ~500-edge / 50k-char cap -> changed-only or text fallback).

Rendering:
- Level 1 (flat, top-level) is the default: readable inline, never trips the size cap.
- nested: true draws depth>1 sub-components as subgraphs (leaf nodes filled, parent containers outlined). Optional --font-size/--node-padding/spacing emit an %%{init}%% directive to enlarge nodes.

scripts/run_local.sh mirrors the action for local iteration (fast diff-only or full pipeline) and writes a browser HTML preview rendered with mermaid.js.
1 parent d7e3ba3 commit f9762d8

6 files changed

Lines changed: 1079 additions & 605 deletions

File tree

Lines changed: 14 additions & 118 deletions
Original file line numberDiff line numberDiff line change
@@ -1,126 +1,22 @@
1-
name: Example Usage of CodeBoarding Action
1+
name: Architecture diff
22

33
on:
4-
workflow_dispatch:
5-
inputs:
6-
repository_url:
7-
description: 'Repository URL to test with'
8-
required: false
9-
default: 'https://github.com/microsoft/markitdown'
10-
type: string
11-
source_branch:
12-
description: 'Source branch for comparison'
13-
required: false
14-
default: 'main'
15-
type: string
16-
target_branch:
17-
description: 'Target branch for comparison'
18-
required: false
19-
default: 'develop'
20-
type: string
21-
output_format:
22-
description: 'Output format for documentation'
23-
required: false
24-
default: '.md'
25-
type: choice
26-
options:
27-
- '.md'
28-
- '.rst'
29-
304
pull_request:
31-
branches: [ main, master ]
32-
types: [opened, synchronize, reopened]
33-
34-
schedule:
35-
# Run daily at 2 AM UTC
36-
- cron: '0 2 * * *'
5+
types: [opened, synchronize, reopened, ready_for_review]
6+
7+
# Only a PR comment is posted — no image is pushed — so contents:write is not needed.
8+
permissions:
9+
pull-requests: write
3710

3811
jobs:
39-
update-docs-action-usage:
12+
architecture-diff:
4013
runs-on: ubuntu-latest
41-
permissions:
42-
contents: write
43-
pull-requests: write
44-
14+
if: github.event.pull_request.draft == false
15+
timeout-minutes: 60
4516
steps:
46-
- name: Checkout repository
47-
uses: actions/checkout@v4
48-
with:
49-
token: ${{ secrets.GITHUB_TOKEN }}
50-
fetch-depth: 0 # Required to access branch history
51-
52-
# Determine branches based on context
53-
- name: Set branch variables
54-
id: set-branches
55-
run: |
56-
if [ "${{ github.event_name }}" = "pull_request" ]; then
57-
echo "source_branch=${{ github.head_ref }}" >> $GITHUB_OUTPUT
58-
echo "target_branch=${{ github.base_ref }}" >> $GITHUB_OUTPUT
59-
elif [ "${{ github.event.inputs.source_branch }}" != "" ] && [ "${{ github.event.inputs.target_branch }}" != "" ]; then
60-
echo "source_branch=${{ github.event.inputs.source_branch }}" >> $GITHUB_OUTPUT
61-
echo "target_branch=${{ github.event.inputs.target_branch }}" >> $GITHUB_OUTPUT
62-
else
63-
# Default to current branch and main
64-
echo "source_branch=${{ github.ref_name }}" >> $GITHUB_OUTPUT
65-
echo "target_branch=main" >> $GITHUB_OUTPUT
66-
fi
67-
68-
- name: Fetch CodeBoarding Documentation
69-
id: codeboarding
70-
uses: ./
71-
with:
72-
repository_url: ${{ github.event.inputs.repository_url }}
73-
source_branch: ${{ steps.set-branches.outputs.source_branch }}
74-
target_branch: ${{ steps.set-branches.outputs.target_branch }}
75-
output_directory: 'docs'
76-
output_format: ${{ github.event.inputs.output_format || '.md' }}
77-
78-
- name: Display Action Results
79-
run: |
80-
echo "Documentation files created: ${{ steps.codeboarding.outputs.markdown_files_created }}"
81-
echo "JSON files created: ${{ steps.codeboarding.outputs.json_files_created }}"
82-
echo "Documentation directory: ${{ steps.codeboarding.outputs.output_directory }}"
83-
echo "JSON directory: ${{ steps.codeboarding.outputs.json_directory }}"
84-
echo "Has changes: ${{ steps.codeboarding.outputs.has_changes }}"
85-
86-
# Check if we have any changes to commit
87-
- name: Check for changes
88-
id: git-changes
89-
run: |
90-
if [ -n "$(git status --porcelain)" ]; then
91-
echo "has_git_changes=true" >> $GITHUB_OUTPUT
92-
else
93-
echo "has_git_changes=false" >> $GITHUB_OUTPUT
94-
fi
95-
96-
- name: Create Pull Request
97-
if: steps.git-changes.outputs.has_git_changes == 'true' && steps.codeboarding.outputs.has_changes == 'true'
98-
uses: peter-evans/create-pull-request@v5
17+
- uses: codeboarding/codeboarding-action@v1
9918
with:
100-
token: ${{ secrets.GITHUB_TOKEN }}
101-
commit-message: "docs: update codeboarding documentation"
102-
title: "📚 CodeBoarding Documentation Update"
103-
body: |
104-
## 📚 Documentation Update
105-
106-
This PR contains updated documentation files fetched from the CodeBoarding service.
107-
108-
### 📊 Summary
109-
- **Documentation files created/updated**: ${{ steps.codeboarding.outputs.markdown_files_created }}
110-
- **JSON files created/updated**: ${{ steps.codeboarding.outputs.json_files_created }}
111-
- **Documentation directory**: `${{ steps.codeboarding.outputs.output_directory }}/`
112-
- **JSON directory**: `${{ steps.codeboarding.outputs.json_directory }}/`
113-
- **Source branch**: `${{ steps.set-branches.outputs.source_branch }}`
114-
- **Target branch**: `${{ steps.set-branches.outputs.target_branch }}`
115-
- **Output format**: `${{ github.event.inputs.output_format || '.md' }}`
116-
- **Repository analyzed**: `${{ steps.codeboarding.outputs.repo_url }}`
117-
118-
### 🔍 Changes
119-
Files have been updated with fresh documentation content based on code changes between branches.
120-
121-
---
122-
123-
🤖 This PR was automatically generated by the CodeBoarding documentation update workflow.
124-
branch: docs/codeboarding-update
125-
base: ${{ steps.set-branches.outputs.target_branch }}
126-
delete-branch: true
19+
llm_api_key: ${{ secrets.OPENROUTER_API_KEY }}
20+
# depth_level: '1' # 1-3, higher = more detail
21+
# diagram_direction: 'LR' # LR | TD | TB | RL | BT
22+
# changed_only: 'false' # 'true' to draw only changed components

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,9 @@
22
test_response.json
33
test_codeboarding/
44

5+
# Local test harness output (scripts/run_local.sh)
6+
.cb-local/
7+
58
# Environment files
69
.env
710

README.md

Lines changed: 98 additions & 75 deletions
@@ -1,111 +1,134 @@
11
<div align="center">
22
<img src="assets/icon.svg" alt="CodeBoarding Logo" height="150" />
3-
4-
# CodeBoarding [Diagram-First Documentation]
5-
6-
[![GitHub Action](https://img.shields.io/badge/GitHub-Action-blue?logo=github-actions)](https://github.com/marketplace/actions/codeboarding-diagram-first-documentation)
3+
4+
# CodeBoarding Architecture Diff (Mermaid)
5+
6+
Posts a PR comment with a **Mermaid** architecture diagram showing which components changed — **green** added, **yellow** modified, **red** deleted — for both nodes and arrows.
77
</div>
88

9-
Generates diagram-first visualizations of your codebase using static analysis and large language models.
9+
## What it does
10+
11+
On every pull request, this action:
12+
13+
1. Resolves a **base ("before") analysis**: it reads the `.codeboarding/analysis.json` committed at the PR base commit if one exists; otherwise it runs a full CodeBoarding analysis on the base commit to produce one.
14+
2. Runs an **incremental analysis on the PR head**, seeded from the base analysis — only LLM-calling the components whose code actually changed, so a typical PR costs a handful of LLM calls.
15+
3. **Diffs the two analyses** and renders the architecture graph as a Mermaid block with changed components and relations colored:
16+
- **green** — added
17+
- **yellow** — modified
18+
- **red** (dashed) — deleted
19+
4. Posts a sticky PR comment containing the Mermaid block. **GitHub renders the diagram inline** — no image, no Playwright, no extra branch.
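
The name-based diff in step 3 can be sketched roughly as follows. This is a minimal illustration and not the code in `scripts/diff_to_mermaid.py`: the `components` / `relations` keys and the `name`, `src`, `dst`, and `label` fields are an assumed shape for `analysis.json`, and treating any field difference as a component modification is likewise an assumption.

```python
from typing import Dict, List, Tuple

# Assumed (hypothetical) shape of an analysis:
# {"components": [{"name": ...}, ...],
#  "relations":  [{"src": ..., "dst": ..., "label": ...}, ...]}
Analysis = Dict[str, List[dict]]


def diff_components(base: Analysis, head: Analysis) -> Dict[str, str]:
    """Match components by name and classify each one."""
    old = {c["name"]: c for c in base.get("components", [])}
    new = {c["name"]: c for c in head.get("components", [])}
    status: Dict[str, str] = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            status[name] = "added"
        elif name not in new:
            status[name] = "deleted"
        elif old[name] != new[name]:
            # Assumption: any differing field counts as a modification.
            status[name] = "modified"
        else:
            status[name] = "unchanged"
    return status


def diff_relations(base: Analysis, head: Analysis) -> Dict[Tuple[str, str], str]:
    """Match relations by (src, dst); a changed label means 'modified'."""
    old = {(r["src"], r["dst"]): r.get("label", "") for r in base.get("relations", [])}
    new = {(r["src"], r["dst"]): r.get("label", "") for r in head.get("relations", [])}
    status: Dict[Tuple[str, str], str] = {}
    for key in sorted(old.keys() | new.keys()):
        if key not in old:
            status[key] = "added"
        elif key not in new:
            status[key] = "deleted"
        elif old[key] != new[key]:
            status[key] = "modified"
        else:
            status[key] = "unchanged"
    return status
```

Because the join key is the name, a renamed component has no counterpart on the other side and is reported as one deletion plus one addition.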
1020

1121
## Usage
1222

1323
```yaml
14-
name: Generate Documentation
24+
name: Architecture diff
1525
on:
16-
push:
17-
branches: [ main ]
1826
pull_request:
19-
branches: [ main ]
20-
types: [opened, synchronize, reopened]
27+
types: [opened, synchronize, reopened, ready_for_review]
28+
29+
permissions:
30+
pull-requests: write # the only permission needed — nothing is pushed
2131

2232
jobs:
23-
documentation:
33+
diagram:
2434
runs-on: ubuntu-latest
35+
if: github.event.pull_request.draft == false
36+
timeout-minutes: 60
2537
steps:
26-
- name: Checkout
27-
uses: actions/checkout@v4
28-
with:
29-
fetch-depth: 0 # Required to access branch history
30-
31-
- name: Generate Documentation
32-
uses: codeboarding/codeboarding-ghaction@v1
38+
- uses: codeboarding/codeboarding-action@v1
3339
with:
34-
repository_url: ${{ github.server_url }}/${{ github.repository }}
35-
source_branch: ${{ github.head_ref || github.ref_name }}
36-
target_branch: ${{ github.base_ref || 'main' }}
37-
output_directory: 'docs'
38-
output_format: '.md'
39-
40-
- name: Upload Documentation
41-
uses: actions/upload-artifact@v4
42-
with:
43-
name: documentation
44-
path: |
45-
docs/
46-
.codeboarding/
40+
llm_api_key: ${{ secrets.OPENROUTER_API_KEY }}
4741
```
4842
43+
You need **one secret**: an LLM API key. OpenRouter is the default; pass your own model via the `agent_model` / `parsing_model` inputs if you prefer.
44+
4945
## Inputs
5046

51-
| Input | Description | Required | Default |
52-
|-------|-------------|----------|---------|
53-
| `repository_url` | Repository URL for which documentation will be generated | Yes | - |
54-
| `source_branch` | Source branch for comparison (typically the PR branch) | Yes | - |
55-
| `target_branch` | Target branch for comparison (typically the base branch) | Yes | - |
56-
| `output_directory` | Directory where documentation files will be saved | No | `docs` |
57-
| `output_format` | Format for documentation files (either `.md` or `.rst`) | No | `.md` |
47+
| Input | Default | Description |
48+
|---|---|---|
49+
| `llm_api_key` | (required) | LLM API key. Currently OpenRouter (`OPENROUTER_API_KEY`). |
50+
| `github_token` | `${{ github.token }}` | Token used to post the comment. |
51+
| `engine_ref` | `main` | Git ref of `CodeBoarding/CodeBoarding`. Pin in production. |
52+
| `depth_level` | `1` | Diagram depth (1–3). Higher = slower + more detail. |
53+
| `agent_model` | `openrouter/anthropic/claude-sonnet-4` | LLM for analysis. |
54+
| `parsing_model` | `openrouter/anthropic/claude-sonnet-4` | LLM for parsing. |
55+
| `comment_header` | `Architecture review` | Header line of the PR comment. |
56+
| `diagram_direction` | `LR` | Mermaid layout direction: `LR`, `TD`, `TB`, `RL`, or `BT`. |
57+
| `changed_only` | `false` | Draw only changed components and their incident edges. |
58+
| `nested` | `false` | Draw depth>1 sub-components as nested subgraphs (pair with `depth_level >= 2`). |
5859

5960
## Outputs
6061

6162
| Output | Description |
62-
|--------|-------------|
63-
| `markdown_files_created` | Number of documentation files created |
64-
| `json_files_created` | Number of JSON files created |
65-
| `output_directory` | Directory where documentation files were saved |
66-
| `json_directory` | Directory where JSON files were saved (always `.codeboarding`) |
67-
| `has_changes` | Whether any files were created or changed |
63+
|---|---|
64+
| `diagram_md` | Path to the rendered ```` ```mermaid ```` block in the runner workspace. |
65+
| `n_changed` | Number of top-level components added/modified/deleted. |
66+
| `truncated` | `true` if the diagram was reduced to changed-only to fit GitHub's Mermaid limit. |
67+
68+
## How the diff is colored
69+
70+
Nodes are styled with Mermaid `classDef` / `class`; arrows are styled with positional `linkStyle`. A relation counts as **modified** when its endpoints are unchanged but its label text changed. Example of the emitted block:
71+
72+
```mermaid
73+
graph LR
74+
Api["API Gateway"]
75+
Auth["Auth Service"]
76+
Cache["Cache"]
77+
Api -- "routes to" --> Auth
78+
Auth -- "reads/writes" --> Cache
79+
classDef added fill:#1f883d,stroke:#0b5d23,color:#ffffff;
80+
classDef modified fill:#bf8700,stroke:#7d4e00,color:#ffffff;
81+
classDef deleted fill:#cf222e,stroke:#82071e,color:#ffffff,stroke-dasharray:5 3;
82+
class Cache added;
83+
class Auth modified;
84+
class Api deleted;
85+
linkStyle 0 stroke:#cf222e,stroke-width:2px,stroke-dasharray:5 3;
86+
linkStyle 1 stroke:#1f883d,stroke-width:2px;
87+
```
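
As a rough sketch of how such a block could be generated, the function below turns node and edge statuses into `classDef` / `class` / `linkStyle` lines. It is an illustration under assumptions, not the action's implementation: the `render` signature, the generated `n0`, `n1`, ... node ids, and the yellow stroke used for modified edges (borrowed from the node fill color) are all hypothetical. The one detail that matters is that `linkStyle` is positional, so each index must follow the order in which edges are written.

```python
from typing import Dict, List, Tuple

NODE_STYLE = {
    "added": "fill:#1f883d,stroke:#0b5d23,color:#ffffff",
    "modified": "fill:#bf8700,stroke:#7d4e00,color:#ffffff",
    "deleted": "fill:#cf222e,stroke:#82071e,color:#ffffff,stroke-dasharray:5 3",
}
EDGE_STYLE = {
    "added": "stroke:#1f883d,stroke-width:2px",
    "modified": "stroke:#bf8700,stroke-width:2px",  # assumed color
    "deleted": "stroke:#cf222e,stroke-width:2px,stroke-dasharray:5 3",
}

Edge = Tuple[str, str, str]  # (source name, target name, label)


def escape(label: str) -> str:
    """Replace double quotes with Mermaid's #quot; entity code."""
    return label.replace('"', "#quot;")


def render(
    nodes: List[str],
    edges: List[Edge],
    node_status: Dict[str, str],
    edge_status: Dict[Tuple[str, str], str],
    direction: str = "LR",
) -> str:
    # Generated ids keep arbitrary component names out of Mermaid identifiers.
    ids = {name: f"n{i}" for i, name in enumerate(nodes)}
    lines = [f"graph {direction}"]
    for name in nodes:
        lines.append(f'    {ids[name]}["{escape(name)}"]')
    for src, dst, label in edges:
        lines.append(f'    {ids[src]} -- "{escape(label)}" --> {ids[dst]}')
    for status, style in NODE_STYLE.items():
        lines.append(f"    classDef {status} {style};")
    for name in nodes:
        status = node_status.get(name)
        if status in NODE_STYLE:
            lines.append(f"    class {ids[name]} {status};")
    # The linkStyle index is the position of the edge in definition order.
    for index, (src, dst, _label) in enumerate(edges):
        status = edge_status.get((src, dst))
        if status in EDGE_STYLE:
            lines.append(f"    linkStyle {index} {EDGE_STYLE[status]};")
    return "\n".join(lines)
```

Unchanged nodes and edges get no `class` or `linkStyle` line and keep Mermaid's default styling.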
6888

69-
## How It Works
89+
## No baseline required
7090

71-
The action works by:
91+
If `.codeboarding/analysis.json` isn't committed at the PR base commit, the action **generates the baseline itself** by running a full analysis on the base commit, then diffs the head against it. Committing a baseline on your default branch makes runs cheaper (the base run is skipped) and the diff more stable, but it is not required.
7292

73-
1. Analyzing the differences introduced in the source branch and putting the results in the target branch
74-
2. Generating documentation files based on the latest version of the source branch
75-
3. Outputting two types of files:
76-
- Documentation files (Markdown or RST) in the specified output directory
77-
- Metadata files in the `.codeboarding` directory
93+
## Fork PRs
7894

79-
## License
95+
Because nothing is pushed (the diagram is inline Mermaid), there is no image step to skip on forks. The one caveat is GitHub's own policy: **secrets are withheld from `pull_request`-triggered runs on forks**, so the LLM key is unavailable and the run fails early with a clear message. A maintainer can re-run from the Actions tab, or use `pull_request_target` if you understand its security implications.
8096

81-
MIT License - see [LICENSE](LICENSE) file for details.
97+
## Limitations
8298

83-
# CodeBoarding GitHub Action
99+
- **GitHub Mermaid caps.** Inline Mermaid in comments is capped (≈500 edges / 50 000 chars). The action stays under this by auto-falling-back to a changed-only graph; if even that overflows it posts a text summary instead of a broken diagram.
100+
- **Nesting.** By default only the top-level component graph is drawn (matching the engine's default `graph LR`). Set `nested: true` with `depth_level >= 2` to draw sub-components as nested subgraphs — leaf nodes filled, parent containers outlined, both colored by status. Large nested graphs are more likely to hit GitHub's Mermaid caps (above), in which case the action degrades to changed-only or a text summary.
101+
- **Renames show as remove + add.** Components are matched across the two analyses by name (the stable join), so a renamed component appears as a red removal plus a green addition rather than a single yellow change.
102+
- **No click-through.** GitHub renders Mermaid in strict security mode, so node hyperlinks are disabled.
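
The fallback order described under "GitHub Mermaid caps" can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the limits are the approximate figures quoted above rather than documented constants, and counting edges by looking for `-->` is a simplifying assumption.

```python
from typing import Tuple

MAX_EDGES = 500      # approximate cap quoted above
MAX_CHARS = 50_000   # approximate cap quoted above


def count_edges(mermaid: str) -> int:
    """Count edge lines; assumes every edge is written with '-->'."""
    return sum(1 for line in mermaid.splitlines() if "-->" in line)


def fits(mermaid: str) -> bool:
    return count_edges(mermaid) <= MAX_EDGES and len(mermaid) <= MAX_CHARS


def pick_comment_body(full: str, changed_only: str, text_summary: str) -> Tuple[str, str]:
    """Return (mode, body): the richest rendering that stays under the caps."""
    if fits(full):
        return "full", full
    if fits(changed_only):
        return "changed_only", changed_only
    return "text", text_summary
```

A caller could derive the `truncated` output from the returned mode, for example by reporting `true` whenever the mode is not `"full"`.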
84103

85-
## Important: Timeout Configuration
104+
## Local testing
86105

87-
For large repositories, the analysis can take 15-45 minutes. Make sure to configure appropriate timeouts in your workflow:
106+
A GitHub run is slow (engine install + two analyses). To iterate locally, use `scripts/run_local.sh`. It mirrors `action.yml` and writes `.cb-local/diagram.md` plus a `.cb-local/preview.html` you open in a browser (rendered with mermaid.js in GitHub's strict mode, so it looks like the comment will).
88107

89-
```yaml
90-
jobs:
91-
generate-docs:
92-
runs-on: ubuntu-latest
93-
timeout-minutes: 60 # Set to 60+ minutes for large repositories
94-
steps:
95-
- uses: actions/checkout@v4
96-
- uses: your-username/codeboarding-ghaction@v1
97-
with:
98-
# your inputs here
108+
**Fast — no LLM, instant.** Diff two existing `analysis.json` files. Great for iterating on colors/layout. For a realistic pair, pull two revisions of a committed analysis:
109+
110+
```bash
111+
git show <old-sha>:.codeboarding/analysis.json > /tmp/base.json
112+
git show <new-sha>:.codeboarding/analysis.json > /tmp/head.json
113+
scripts/run_local.sh --base-json /tmp/base.json --head-json /tmp/head.json
99114
```
100115

101-
## Timeout Guidelines
116+
**Full pipeline — needs an LLM key.** Runs the engine on two refs of a local repo exactly like the action (committed-or-generated base, then incremental head):
117+
118+
```bash
119+
export OPENROUTER_API_KEY=sk-or-...
120+
scripts/run_local.sh --repo /path/to/repo --base <base-ref> --head <head-ref> \
121+
--engine /path/to/CodeBoarding # defaults to ../CodeBoarding
122+
```
102123

103-
- **Small repositories** (<1k files): 10-15 minutes
104-
- **Medium repositories** (1k-5k files): 20-30 minutes
105-
- **Large repositories** (5k+ files): 30-60 minutes
106-
- **Very large repositories** (10k+ files): 45-90 minutes
124+
Flags: `--depth N`, `--direction LR|TD|…`, `--nested`, `--changed-only`, `--no-edge-labels`, `--out DIR`, `--no-open`.
125+
126+
The diagram step alone is also directly runnable:
127+
128+
```bash
129+
python3 scripts/diff_to_mermaid.py --base base/analysis.json --head head/analysis.json --out diagram.md
130+
```
131+
132+
## License
107133

108-
If your workflow consistently times out, consider:
109-
1. Increasing `timeout-minutes` to 90 or higher
110-
2. Running the action on a schedule during off-peak hours
111-
3. Analyzing specific branches with smaller diffs
134+
MIT — see [LICENSE](LICENSE).
