
fix(site-audit): build files miscategorized as config; summary shows truncated counts; SPECKIT_VERSION not in pre-scan output #1

@markhazleton

Description

Summary

Three bugs were found in speckit.site-audit, in the PowerShell pre-scan script (.documentation/scripts/powershell/site-audit.ps1) and its summary output. All three have been reproduced and fixed in markhazleton/git-spark on 2026-03-30.


Bug 1 — Build files always reported as 0

Symptom: The build file count is always 0 even when .github/workflows/*.yml files exist.

Root cause: The file categorization loop checks extensions before it checks paths. .yml/.yaml files match the config extension list and hit a continue before the build path check (\.github/workflows/) is ever evaluated.

Fix: Move the build path check (.github/workflows/, Dockerfile, Makefile) before the config extension check and add a continue so it short-circuits correctly.

# BEFORE (wrong order — .yml files claimed by config first)
if ($ext -in $configExtensions ...) { $categories.config += ...; continue }
...
if ($relativePath -match '\.github/workflows/') { $categories.build += ... }   # ← never reached for .yml

# AFTER (build path checked first)
if ($relativePath -match '(^|/)\.github/workflows/' ...) { $categories.build += ...; continue }
if ($ext -in $configExtensions ...) { $categories.config += ...; continue }

Observed impact (git-spark repo):

| Field | Before fix | After fix |
| --- | --- | --- |
| Build files | 0 | 5 |
| Config files | 16 | 11 |
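
The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the actual loop from site-audit.ps1: the function name Get-FileCategory, the extension list, and the sample paths are all hypothetical, chosen only to show why the path check has to run before the extension check.

```powershell
function Get-FileCategory {
    param(
        [string]$RelativePath,
        # Hypothetical stand-in for the script's real extension list.
        [string[]]$ConfigExtensions = @('.yml', '.yaml', '.json', '.toml')
    )

    # Normalize separators so the pattern also matches Windows-style paths.
    $path = $RelativePath -replace '\\', '/'
    $name = [System.IO.Path]::GetFileName($path)
    $ext  = [System.IO.Path]::GetExtension($path).ToLowerInvariant()

    # Path-based build check runs first, so workflow .yml files
    # are not claimed by the config extension check below.
    if ($path -match '(^|/)\.github/workflows/' -or $name -in @('Dockerfile', 'Makefile')) {
        return 'build'
    }
    if ($ext -in $ConfigExtensions) {
        return 'config'
    }
    return 'other'
}

# Hypothetical sample paths: two build files, one config file, one other.
$samples = @('.github/workflows/ci.yml', 'config/settings.yml', 'Dockerfile', 'src/app.ts')
$counts = @{ build = 0; config = 0; other = 0 }
foreach ($p in $samples) {
    $category = Get-FileCategory -RelativePath $p
    $counts[$category] += 1
}
$counts
```

Swapping the two if blocks in this sketch reproduces the bug: the workflow file lands in config and the build count drops by one.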

Bug 2 — Summary output shows truncated counts instead of true totals

Symptom: The -OutputFormat summary output reports large_files.Count, hardcoded_secrets.Count, etc. These arrays are capped at $LargeFileLimit / $PatternSampleLimit (both default to 25), so on repos with more findings the printed count silently tops out at 25.

Root cause: After the sampling step the script stores actual totals in _total properties (large_files_total, hardcoded_secrets_total, insecure_patterns_total, todo_comments_total) but the summary block never uses them — it reads .Count on the already-truncated arrays instead.

Fix: Replace .Count references in the summary output with the pre-computed _total properties:

# BEFORE
Write-Output "  Large files (>500 lines): $($result.metrics.large_files.Count)"
Write-Output "  Potential secrets: $($result.patterns.security.hardcoded_secrets.Count)"
Write-Output "  Insecure patterns: $($result.patterns.security.insecure_patterns.Count)"
Write-Output "  TODO/FIXME comments: $($result.patterns.quality.todo_comments.Count)"

# AFTER
Write-Output "  Large files (>500 lines): $($result.metrics.large_files_total)"
Write-Output "  Potential secrets: $($result.patterns.security.hardcoded_secrets_total)"
Write-Output "  Insecure patterns: $($result.patterns.security.insecure_patterns_total)"
Write-Output "  TODO/FIXME comments: $($result.patterns.quality.todo_comments_total)"
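
The difference between the two reads is easy to see with synthetic data. This is a sketch under stated assumptions: the 40 fake findings and the $quality object are made up for illustration, and only the property names todo_comments / todo_comments_total and the default limit of 25 come from the description above.

```powershell
$PatternSampleLimit = 25   # default stated in this issue

# Simulate 40 findings (hypothetical data), then keep a capped sample
# alongside the true total, as the sampling step is described to do.
$allTodos = 1..40 | ForEach-Object { "src/file$($_).ps1: TODO" }
$quality = [pscustomobject]@{
    todo_comments       = @($allTodos | Select-Object -First $PatternSampleLimit)
    todo_comments_total = $allTodos.Count
}

$sampleCount = $quality.todo_comments.Count      # what the summary printed before the fix
$trueTotal   = $quality.todo_comments_total      # what it prints after the fix

Write-Output "  TODO/FIXME comments (sample .Count): $sampleCount"
Write-Output "  TODO/FIXME comments (_total):        $trueTotal"
```

With 40 simulated findings, the sample count is 25 and the total is 40, which is the gap the summary was hiding.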

Bug 3 — SPECKIT_VERSION not included in pre-scan JSON output

Symptom: The JSON emitted by site-audit.ps1 -Json contains no information from .documentation/SPECKIT_VERSION. The audit mode prompt's version-check step must read the file in a separate tool call, adding latency and a step that can fail silently when the stamp is absent.

Fix: Add a Get-SpeckitVersion helper that reads the stamp file and returns structured data. Include the result as a top-level speckit key in the output object — available in both JSON and summary modes.

{
  "speckit": {
    "stamp_exists": true,
    "installed_version": "1.5.1",
    "installed_date": "2026-03-30",
    "agent": "copilot"
  }
}

Summary line added:

Spec Kit Version: 1.5.1
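
For illustration, a helper along these lines could produce that object. This is only a sketch, not the code that was committed: the on-disk layout of SPECKIT_VERSION is not described in this issue, so the "key: value" line format, the -StampPath parameter, and the temp-file usage below are assumptions. Only the output keys (stamp_exists, installed_version, installed_date, agent) come from the JSON above.

```powershell
function Get-SpeckitVersion {
    param([string]$StampPath = '.documentation/SPECKIT_VERSION')

    # Start from the "stamp absent" shape so callers always see the same keys.
    $info = [ordered]@{
        stamp_exists      = $false
        installed_version = $null
        installed_date    = $null
        agent             = $null
    }
    if (-not (Test-Path -LiteralPath $StampPath -PathType Leaf)) {
        return $info
    }

    $info['stamp_exists'] = $true
    # ASSUMPTION: the stamp holds "key: value" lines; the real format may differ.
    foreach ($line in Get-Content -LiteralPath $StampPath) {
        if ($line -match '^\s*(version|date|agent)\s*:\s*(.+?)\s*$') {
            $key   = $Matches[1].ToLowerInvariant()
            $value = $Matches[2]
            switch ($key) {
                'version' { $info['installed_version'] = $value }
                'date'    { $info['installed_date']    = $value }
                'agent'   { $info['agent']             = $value }
            }
        }
    }
    return $info
}

# Usage: write a throwaway stamp (hypothetical content) and emit the
# top-level "speckit" key as JSON.
$stamp = Join-Path ([System.IO.Path]::GetTempPath()) 'SPECKIT_VERSION.sample'
Set-Content -LiteralPath $stamp -Value @('version: 1.5.1', 'date: 2026-03-30', 'agent: copilot')
$speckit = Get-SpeckitVersion -StampPath $stamp
[ordered]@{ speckit = $speckit } | ConvertTo-Json -Depth 3
Remove-Item -LiteralPath $stamp
```

Returning a fully populated object with stamp_exists set to false, rather than $null, means a missing stamp shows up explicitly in the output instead of failing silently.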

Reproduction

On any repo that has .github/workflows/*.yml files, run:

.documentation/scripts/powershell/site-audit.ps1 -OutputFormat summary
# Expected: Build files: 5 (or however many .yml workflow files exist)
# Actual:   Build files: 0

Impact

| Bug | Affected output | Severity |
| --- | --- | --- |
| Build files = 0 | JSON + summary | Medium: VER3/VER4 workflow findings can be silently missed |
| Truncated counts in summary | Summary only | Low: JSON totals are correct |
| No SPECKIT_VERSION in JSON | JSON | Low: requires extra tool call to read stamp |

Discovered and fixed in markhazleton/git-spark on 2026-03-30.

Metadata

Labels

bug (Something isn't working)
