
Commit d525132

Authored by Copilot and lpcox
feat: add 4 agentic code quality workflows (#2417)
* Initial plan

* feat: add 4 agentic code quality workflows

  - duplicate-code-detector.md: weekly scan for near-duplicate code blocks, files issues for high-impact deduplication opportunities
  - test-coverage-reporter.md: weekly + on push to main, reports coverage trends as GitHub Discussions
  - refactoring-scanner.md: weekly scan for oversized/mixed-responsibility files, files refactoring issues
  - export-audit.md: on push to main, detects unused exports, naming inconsistencies, circular deps, wrong test imports

  All compiled to .lock.yml via gh aw compile and post-processed.

  Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/28e98601-c6b6-425c-9141-a0d0c455ec04

* fix: capitalize/punctuation review feedback in workflow descriptions

  Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/28e98601-c6b6-425c-9141-a0d0c455ec04

* feat: change duplicate-code-detector, refactoring-scanner, test-coverage-reporter to daily schedule

  Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/6bcb659e-1e94-4f2d-9d14-4071b8c26f60

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

* fix: address code review feedback on all 4 workflow .md files

  Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/f49cc459-e3d0-4f79-8c1e-f157a23d6a9b

  Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

---

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
1 parent: d8e7897 · commit: d525132

8 files changed: 5,304 additions & 0 deletions

.github/workflows/duplicate-code-detector.lock.yml: 1,037 additions & 0 deletions (generated file, not rendered by default)

.github/workflows/duplicate-code-detector.md: 223 additions & 0 deletions
@@ -0,0 +1,223 @@
---
description: |
  Daily workflow that scans the codebase for duplicate and near-duplicate code blocks,
  copy-paste patterns, and repeated logic sequences in TypeScript source and JavaScript
  container code. Files actionable issues for high-impact deduplication opportunities
  to prevent technical debt from accumulating silently.

on:
  schedule: daily
  workflow_dispatch:

permissions:
  contents: read
  issues: read

sandbox:
  agent:
    version: v0.25.29
network:
  allowed:
    - node
    - github

tools:
  github:
    toolsets: [issues]
  bash: true

safe-outputs:
  threat-detection:
    enabled: false
  create-issue:
    title-prefix: "[Duplicate Code] "
    labels: [code-quality, refactoring]
    max: 5
    expires: 30d

timeout-minutes: 20
---

# Duplicate Code Detector

You are a code quality engineer analyzing the `${{ github.repository }}` codebase for duplicated and near-duplicate code. Your mission is to surface high-impact deduplication opportunities that will reduce maintenance burden and improve consistency.

## Repository Context

This is **gh-aw-firewall**, a network firewall for GitHub Copilot CLI. The most important source files for duplication analysis are:

- `src/docker-manager.ts` — 3,900+ lines; container lifecycle, env-var construction, volume mounts
- `src/cli.ts` — 1,700+ lines; argument parsing, orchestration, config merging
- `containers/api-proxy/server.js` — provider-agnostic proxy server
- `containers/api-proxy/providers/*.js` — per-provider adapter modules

## Phase 1: Gather Codebase Metrics

Run these commands to understand the scope before diving into duplication:

```bash
# File sizes and line counts
wc -l src/*.ts src/**/*.ts containers/api-proxy/*.js containers/api-proxy/providers/*.js 2>/dev/null | sort -rn | head -30

# Total files and lines
echo "=== TypeScript source ==="
find src -name "*.ts" ! -name "*.test.ts" | xargs wc -l 2>/dev/null | sort -rn | head -20
echo "=== Container JS ==="
find containers -name "*.js" | xargs wc -l 2>/dev/null | sort -rn | head -20
```

## Phase 2: Detect Structural Duplication

Install and run the `jscpd` (JavaScript Copy/Paste Detector) tool to find literal code duplication:

```bash
# Install jscpd
npm install -g jscpd 2>&1 | tail -3

# Run duplicate detection on TypeScript source
jscpd src --min-lines 10 --min-tokens 50 --reporters json --output /tmp/jscpd-src 2>&1 | tail -20

# Run on container JS
jscpd containers --min-lines 10 --min-tokens 50 --reporters json --output /tmp/jscpd-containers 2>&1 | tail -20

# Show summary
cat /tmp/jscpd-src/jscpd-report.json 2>/dev/null | node -e "
const d = JSON.parse(require('fs').readFileSync('/dev/stdin', 'utf8'));
const clones = d.duplicates || [];
console.log('Total duplicates found:', clones.length);
clones.slice(0, 10).forEach(c => {
  const f1 = c.firstFile?.name?.replace(process.cwd() + '/', '') || 'unknown';
  const f2 = c.secondFile?.name?.replace(process.cwd() + '/', '') || 'unknown';
  console.log(\` \${f1}:\${c.firstFile?.start}-\${c.firstFile?.end} <-> \${f2}:\${c.secondFile?.start}-\${c.secondFile?.end} (\${c.fragment?.split('\\n').length || 0} lines)\`);
});
" || echo "(jscpd report not available)"
```

## Phase 3: Detect Pattern-Level Duplication

Use grep to find repeated code patterns that jscpd may not catch (semantic duplication):

```bash
echo "=== Env-var reading/trimming patterns ==="
grep -rn "process\.env\." src/ --include="*.ts" | grep -v "test" | head -40

echo "=== Docker exec/run command construction patterns ==="
grep -n "execa\|execaSync\|docker.*run\|docker.*exec" src/docker-manager.ts | head -30

echo "=== Config/validation patterns in config-file.ts and schema-validator.ts ==="
grep -n "throw\|error\|invalid\|validate" src/config-file.ts | head -20
grep -n "throw\|error\|invalid\|validate" src/schema-validator.ts 2>/dev/null | head -20

echo "=== Repeated try/catch error handling patterns ==="
grep -n -A 3 "catch (e" src/docker-manager.ts | head -60

echo "=== Provider adapter patterns in api-proxy ==="
for f in containers/api-proxy/providers/*.js; do
  echo "--- $f ---"
  grep -n "function\|const.*=.*(" "$f" | head -10
done

echo "=== Repeated log construction patterns ==="
grep -rn "logger\.\(debug\|info\|warn\|error\)" src/ --include="*.ts" | \
  sed 's/.*logger\.\(debug\|info\|warn\|error\)(\(.*\))/\2/' | \
  sort | uniq -d | head -20
```

## Phase 4: Analyze Specific Known Duplication Areas

Based on codebase knowledge, deeply analyze the most likely duplication hotspots:

```bash
echo "=== docker-manager.ts: env-var construction ==="
grep -n "env\[.*\]\s*=\|envVars\.\|\.trim()\|process\.env\." src/docker-manager.ts | head -50

echo "=== docker-manager.ts: repeated docker compose args patterns ==="
grep -n "composeArgs\|dockerArgs\|\-f.*compose\|--project-name" src/docker-manager.ts | head -30

echo "=== cli.ts: option handling patterns ==="
grep -n "\.option\|options\.\|program\." src/cli.ts | head -50

echo "=== API proxy provider similarity (getConfig patterns) ==="
for f in containers/api-proxy/providers/openai.js containers/api-proxy/providers/anthropic.js containers/api-proxy/providers/gemini.js containers/api-proxy/providers/copilot.js containers/api-proxy/providers/opencode.js; do
  if [ -f "$f" ]; then
    echo "--- $f: exported functions ---"
    grep -n "^function\|^const.*=\s*function\|^module\.exports\|^exports\." "$f" | head -10
  fi
done

echo "=== proxy-utils.js: shared utilities ==="
cat containers/api-proxy/proxy-utils.js 2>/dev/null | head -60
```

## Phase 5: Check for Existing Issues

Before filing new issues, check what's already been reported:

1. Search for open issues with `[Duplicate Code]` prefix using the GitHub toolset
2. Also search for issues with labels `code-quality` or `refactoring` that describe duplication
3. Skip any finding that already has an open tracking issue

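A minimal sketch of the skip logic, assuming search results come back as a JSON array of issues (the fixture data below is hypothetical, not real issues from this repository):

```javascript
// Hypothetical fixture standing in for GitHub toolset search results.
const openIssues = [
  { number: 101, title: "[Duplicate Code] env-var parsing repeated across docker-manager.ts" },
  { number: 102, title: "Unrelated bug report" },
];

// Findings whose titles carry the workflow's title prefix are already tracked
// and should be skipped rather than re-filed.
const alreadyTracked = openIssues.filter(i => i.title.startsWith("[Duplicate Code] "));
alreadyTracked.forEach(i => console.log(`#${i.number} ${i.title}`));
```

Matching on the `title-prefix` configured in the frontmatter keeps the check independent of label conventions.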
## Phase 6: Prioritize and Report Findings

Based on your analysis, identify the **top duplications by impact** using this scoring:

| Factor | Points |
|--------|--------|
| >20 duplicate lines | +3 |
| Affects security-critical path | +3 |
| In file >1000 lines (maintenance burden) | +2 |
| More than 2 copies | +2 |
| Easy to extract (no complex dependencies) | +1 |

Report only findings with score ≥ 4.

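The scoring table can be expressed as a small helper; this is an illustrative sketch (the function and field names are hypothetical, not part of the workflow):

```javascript
// Hypothetical scoring helper mirroring the table above.
function scoreFinding({ dupLines, copies, inBigFile, securityCritical, easyExtract }) {
  let score = 0;
  if (dupLines > 20) score += 3;      // >20 duplicate lines
  if (securityCritical) score += 3;   // affects security-critical path
  if (inBigFile) score += 2;          // in a file >1000 lines
  if (copies > 2) score += 2;         // more than 2 copies
  if (easyExtract) score += 1;        // easy to extract
  return score;
}

// 35 duplicated lines, 3 copies, in a >1000-line file, easy to extract:
console.log(scoreFinding({ dupLines: 35, copies: 3, inBigFile: true, securityCritical: false, easyExtract: true })); // → 8, reported (≥ 4)
```
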
### For each high-impact finding, create an issue with this format:

**Title**: `[Duplicate Code] <brief description of what is duplicated>`

**Body**:

```markdown
## Duplicate Code Opportunity

### Summary
- **Pattern**: Brief description of what is being duplicated
- **Locations**: File(s) and line ranges containing duplicates
- **Impact**: Lines saved / maintenance burden reduction

### Evidence

<Show the specific duplicated code blocks side by side>

### Suggested Refactoring

Describe the shared utility or abstraction that would eliminate the duplication.
For example:
- Extract a `parseEnvVars(obj)` helper in `src/env-utils.ts`
- Create a base class or mixin for provider adapters
- Add a `buildDockerArgs(config)` factory function

### Affected Files
- `path/to/file.ts` — lines X-Y
- `path/to/other.ts` — lines A-B

### Effort Estimate
Low / Medium / High

---
*Detected by Duplicate Code Detector workflow. Run date: $(date -u +"%Y-%m-%d")*
```

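As one concrete illustration of the `parseEnvVars(obj)` suggestion named in the template, a hypothetical helper (names and env vars here are illustrative, not taken from this repository) could collapse several near-identical read/trim/default blocks:

```javascript
// Hypothetical parseEnvVars helper: read each env var, trim it, and fall back
// to a default when it is unset or blank.
function parseEnvVars(spec) {
  const out = {};
  for (const [key, fallback] of Object.entries(spec)) {
    const raw = process.env[key];
    out[key] = raw !== undefined && raw.trim() !== "" ? raw.trim() : fallback;
  }
  return out;
}

// Example: one call replaces repeated per-variable handling.
process.env.EXAMPLE_PROXY_PORT = " 8080 ";  // illustrative variable name
const cfg = parseEnvVars({ EXAMPLE_PROXY_PORT: "3000", EXAMPLE_LOG_LEVEL: "info" });
console.log(cfg); // { EXAMPLE_PROXY_PORT: '8080', EXAMPLE_LOG_LEVEL: 'info' }
```

An issue proposing such an extraction should still cite the real files and line ranges where the pattern repeats.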
## Guidelines

- **Be specific**: Always include file paths and line numbers in the evidence section
- **Be actionable**: Each issue should have a clear, implementable suggestion
- **Avoid noise**: Only file issues for genuine duplication with real maintenance impact — not cosmetic similarities
- **No duplicates**: Check existing open issues before creating new ones
- **Security awareness**: Flag duplicated security-critical logic (domain validation, ACL rules, capability management) with higher urgency
- **Cap at 5 issues**: File at most 5 issues per run to avoid flooding the tracker

## Edge Cases

- **No significant duplication found**: Exit gracefully without creating issues; print a summary to the log
- **jscpd unavailable**: Fall back to grep-based pattern analysis only
- **All findings already tracked**: Skip creation and log that existing issues cover the findings
