Commit eb5f50d

Add binlog-analysis skill (build-failure triage via binlog-mcp) (#19957)
* Add binlog-analysis skill (MCP-first build-failure triage)

  Fetches a build's MSBuild binary log (local build, or a failed fsharp-ci Azure DevOps PR build) and analyzes it live via the binlog-mcp MCP server (Microsoft.AITools.BinlogMcp): structured errors, root-cause diagnosis, and an MSBuild perf X-ray.

  - .github/skills/binlog-analysis/: SKILL.md + scripts/Get-Binlog.ps1 (acquisition; analysis delegated to the MCP).
  - .config/dotnet-tools.json: pin Microsoft.AITools.BinlogMcp so `dotnet tool restore` provisions the MCP server (restores from the existing dnceng feed).
  - Point existing build-failure entry points at the skill.

* Add missing name field to agentic-workflows agent

  The skill-validator (.github/skills CI) requires a 'name' frontmatter field on every agent. agentic-workflows.agent.md lacked one, failing the 'Validate skills and agents' check.

* Pin binlog-mcp to public 1.0.0 release

  1.0.0 is the canonical public release of Microsoft.AITools.BinlogMcp; it restores from the existing dnceng dotnet-tools feed (verified) and has the same tool surface. No NuGet.config change.
1 parent 35851a1 commit eb5f50d

8 files changed

Lines changed: 273 additions & 0 deletions

.config/dotnet-tools.json

Lines changed: 7 additions & 0 deletions
@@ -57,6 +57,13 @@
         "ilverify"
       ],
       "rollForward": false
+    },
+    "microsoft.aitools.binlogmcp": {
+      "version": "1.0.0",
+      "commands": [
+        "binlog-mcp"
+      ],
+      "rollForward": true
     }
   }
 }
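
To sanity-check a manifest entry like the one added above, here is a minimal Python sketch that lists which commands a `dotnet-tools.json`-shaped document pins. The embedded manifest is a trimmed stand-in: only the `microsoft.aitools.binlogmcp` entry comes from this commit, and the `version`/`isRoot` wrapper is assumed from the usual tool-manifest layout. `pinned_tools` is a hypothetical helper, not part of the repo.

```python
import json

# Trimmed, illustrative manifest; the real file contains more tools.
MANIFEST = """
{
  "version": 1,
  "isRoot": true,
  "tools": {
    "microsoft.aitools.binlogmcp": {
      "version": "1.0.0",
      "commands": ["binlog-mcp"],
      "rollForward": true
    }
  }
}
"""


def pinned_tools(manifest_text):
    """Map each command name to (package id, version, rollForward)."""
    tools = json.loads(manifest_text).get("tools", {})
    result = {}
    for package_id, entry in tools.items():
        for command in entry.get("commands", []):
            result[command] = (
                package_id,
                entry["version"],
                entry.get("rollForward", False),
            )
    return result


if __name__ == "__main__":
    for command, (package_id, version, roll) in pinned_tools(MANIFEST).items():
        print(f"{command}: {package_id} {version} (rollForward={roll})")
```

Pointed at the repo's real `.config/dotnet-tools.json`, the same function would show whether `binlog-mcp` is available before running `dotnet tool restore`.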

.github/agents/agentic-workflows.agent.md

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 ---
+name: agentic-workflows
 description: GitHub Agentic Workflows (gh-aw) - Create, debug, and upgrade AI-powered workflows with intelligent prompt routing
 disable-model-invocation: true
 ---

.github/agents/compiler-perf-investigator.md

Lines changed: 2 additions & 0 deletions
@@ -7,6 +7,8 @@ description: Specialized agent for investigating F# build performance issues usi
 
 These are **general investigation instructions** for this agent, a template for perf analysis of slow/problematic F# compilation and build, suitable for a variety of scenarios (repos, snippets, gists).
 
+**Related tools:** the `binlog-analysis` skill fetches a build's MSBuild `.binlog` and analyzes it live via the `binlog-mcp` MCP — structured errors, root-cause diagnosis, and target/task/analyzer timings — a fast first pass over a build log before deeper trace/dump analysis.
+
 ---
 
 ## PRINCIPLES OF OPERATION

.github/copilot-instructions.md

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Build fails → 99% YOUR previous change broke it. You ARE the compiler.
 DON'T say "pre-existing", "infra issue", "unrelated".
 DO `git clean -xfd artifacts` and rebuild.
 Bootstrap contamination: early commits break compiler → later "fixes" still use broken bootstrap. Clean fully.
+Triage a build failure → `binlog-analysis` skill fetches the binlog (local build or failed AzDo PR build) and analyzes it live via the `binlog-mcp` MCP (structured errors, root-cause diagnose, MSBuild perf X-ray).
 
 ## Test
 

.github/skills/binlog-analysis/SKILL.md

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+---
+name: binlog-analysis
+description: >-
+  Triage a build / compile / restore / WarnAsError failure from its MSBuild
+  binary log. Fetches the binlog (a local build's, or a failed dotnet/fsharp
+  Azure DevOps PR build's published artifact) and analyzes it live via the
+  `binlog-mcp` MCP server — structured errors, root-cause diagnosis, and an
+  MSBuild perf X-ray. NOT for test failures or CheckCodeFormatting: a build
+  binlog has no errors there.
+---
+
+# Binlog Analysis (via the binlog-mcp MCP server)
+
+Boil a failed build down to root causes. This skill does two small things and
+delegates the heavy lifting to an MCP server:
+
+1. **Fetch** the build's `*.binlog` — from your local build, or from a failed
+   `fsharp-ci` Azure DevOps PR build (downloads the published artifact).
+2. **Hand the path to the `binlog-mcp` MCP server** (`Microsoft.AITools.BinlogMcp`),
+   which queries the binlog live (≈38 tools): structured errors, categorized root
+   causes, and an MSBuild X-ray (target/task/analyzer timings, incrementality,
+   double-writes, …).
+
+Because the analysis lives in the MCP server, this skill stays tiny — and gets
+better automatically as that server gains features.
+
+## When to use
+
+- A local build (with `-bl`) or a failed `fsharp-ci` PR build broke and you need
+  to know **why** — compile / restore / analyzer / WarnAsError errors, or build
+  perf.
+
+## When NOT to use
+
+- **Test** failures or **CheckCodeFormatting**. The build binlog has no errors in
+  that case: `binlog_overview` will show the build succeeded / 0 errors — stop and
+  use `pr-build-status` / `flaky-test-detector` instead.
+
+## Step 1 — get the binlog path
+
+```pwsh
+# Local: newest *.binlog under <repo>/artifacts/log (build first, e.g. ./build.sh --binaryLog)
+pwsh .github/skills/binlog-analysis/scripts/Get-Binlog.ps1
+
+# Local: a specific file, directory, or glob
+pwsh .github/skills/binlog-analysis/scripts/Get-Binlog.ps1 -BinlogPath artifacts/log/Debug/Build.binlog
+
+# Azure DevOps: latest FAILED fsharp-ci build for a PR (downloads + keeps the binlog)
+pwsh .github/skills/binlog-analysis/scripts/Get-Binlog.ps1 -PrNumber 19941
+
+# Azure DevOps: explicit build id; -AllLegs for every leg; -Json for a path list
+pwsh .github/skills/binlog-analysis/scripts/Get-Binlog.ps1 -BuildId 1462217 -Json
+```
+
+It prints the resolved `*.binlog` path(s) (Azure DevOps artifacts are downloaded
+to a temp folder and kept so the MCP can read them).
+
+## Step 2 — analyze via the binlog-mcp MCP tools
+
+With each path, call the MCP server (the argument is `binlog_file`):
+
+- `binlog_overview` — build status + error/warning counts. **Call this first** to
+  decide whether there's anything to analyze.
+- `binlog_diagnose` — categorized root causes + next-step hints.
+- `binlog_errors` / `binlog_warnings` — structured diagnostics (code / file / line
+  / column / project).
+- `binlog_search` — free-form drill-down.
+- Perf: `binlog_expensive_targets` / `binlog_expensive_tasks` /
+  `binlog_expensive_analyzers`, `binlog_incremental_analysis`,
+  `binlog_double_writes`, `binlog_target_graph`.
+
+> **Multi-targeting note.** Today `binlog_errors` returns one row per target
+> framework, so a single source error in a multi-TFM project (e.g.
+> FSharp.Compiler.Service → `net10.0;netstandard2.0`) appears once per TFM. A
+> lossless dedup (`code,file,line` → set of TFMs) is proposed upstream in
+> `dotnet-microsoft/ai-tools`; when it lands, this skill gets the deduped view for
+> free — no change here.
+
+## Prerequisites
+
+- The **binlog-mcp MCP server** registered with your agent. The tool is pinned in
+  the repo's `.config/dotnet-tools.json` (`Microsoft.AITools.BinlogMcp`); run
+  `dotnet tool restore` once, then register it as an MCP server. For example,
+  Copilot CLI (`~/.copilot/mcp-config.json`):
+  ```jsonc
+  { "mcpServers": { "binlog-mcp": {
+      "command": "dotnet", "args": ["tool", "run", "binlog-mcp"],
+      "tools": ["*"], "deferTools": "auto" } } }
+  ```
+  (gh-aw: a `mcp-servers:` block; VS Code: `.vscode/mcp.json`. Telemetry is on by
+  default — opt out with `DOTNET_CLI_TELEMETRY_OPTOUT=1` if desired.)
+- PowerShell 7+ (`pwsh`) and a .NET 10 SDK (already required by the repo).
+- Azure DevOps modes need network access to `dev.azure.com`.
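
The multi-targeting note in SKILL.md describes a lossless dedup that maps `code,file,line` to the set of TFMs. Until something like that exists upstream, a caller could apply the same grouping to the rows it gets back. The Python sketch below illustrates the idea only: the row shape (`code`, `file`, `line`, `tfm` keys) and the sample values are assumptions for illustration, not the actual `binlog_errors` output schema.

```python
from collections import defaultdict


def dedup_errors(rows):
    """Collapse per-TFM duplicates: (code, file, line) -> sorted list of TFMs."""
    grouped = defaultdict(set)
    for row in rows:
        key = (row["code"], row["file"], row["line"])
        grouped[key].add(row["tfm"])
    # Dict insertion order is kept, so errors stay in first-seen order.
    return [
        {"code": code, "file": path, "line": line, "tfms": sorted(tfms)}
        for (code, path, line), tfms in grouped.items()
    ]


# Hypothetical rows: one source error reported once per target framework.
SAMPLE_ROWS = [
    {"code": "FS0001", "file": "src/Example.fs", "line": 42, "tfm": "net10.0"},
    {"code": "FS0001", "file": "src/Example.fs", "line": 42, "tfm": "netstandard2.0"},
    {"code": "FS0039", "file": "src/Other.fs", "line": 7, "tfm": "net10.0"},
]

if __name__ == "__main__":
    for err in dedup_errors(SAMPLE_ROWS):
        print(err["code"], err["file"], err["line"], ",".join(err["tfms"]))
```

The grouping is lossless because no TFM is dropped: each distinct error keeps the full list of frameworks it appeared under.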

.github/skills/binlog-analysis/scripts/Get-Binlog.ps1

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
+<#
+.SYNOPSIS
+    Resolve a build's .binlog and print its path, for live analysis via the
+    `binlog-mcp` MCP server (Microsoft.AITools.BinlogMcp). Works on a local
+    build's binlog or on a failed dotnet/fsharp Azure DevOps PR build's
+    published binlog.
+
+.DESCRIPTION
+    This skill's job is ACQUISITION only — it does not analyze. It locates (and,
+    for Azure DevOps, downloads) the binary log and prints the path(s). Hand each
+    path to the binlog-mcp MCP tools (binlog_overview, binlog_diagnose,
+    binlog_errors, ...) with `binlog_file: <path>`.
+
+    Source (pick one):
+      * Local — pass -BinlogPath, or run with no arguments to auto-discover the
+        newest *.binlog under <repo>/artifacts/log.
+      * Azure DevOps — pass -PrNumber (latest failed `fsharp-ci` build) or an
+        explicit -BuildId; the build-leg binlog artifact is downloaded
+        to a temp folder and KEPT, so the MCP can read it afterwards.
+
+.PARAMETER BinlogPath
+    Local binlog source: a .binlog file, a directory (newest *.binlog inside, or
+    all with -AllLegs), or a glob. No download is performed.
+
+.PARAMETER PrNumber
+    GitHub PR number in dotnet/fsharp. The latest failed build is used.
+
+.PARAMETER BuildId
+    Explicit Azure DevOps build id.
+
+.PARAMETER AllLegs
+    Include every binlog rather than just the build leg / newest: all binlog
+    artifacts for an AzDo build, or every *.binlog in a local directory.
+
+.PARAMETER Json
+    Emit the resolved path(s) as JSON (`{ "binlogs": [ ... ] }`).
+
+.EXAMPLE
+    # Newest local build binlog (build first, e.g. ./build.sh --binaryLog):
+    pwsh Get-Binlog.ps1
+
+.EXAMPLE
+    pwsh Get-Binlog.ps1 -BinlogPath artifacts/log/Debug/Build.binlog
+
+.EXAMPLE
+    pwsh Get-Binlog.ps1 -PrNumber 19941
+
+.EXAMPLE
+    pwsh Get-Binlog.ps1 -BuildId 1462217 -Json
+#>
+[CmdletBinding(DefaultParameterSetName = 'Local')]
+param(
+    [Parameter(ParameterSetName = 'ByPath', Mandatory, Position = 0)]
+    [string[]]$BinlogPath,
+
+    [Parameter(ParameterSetName = 'ByPr', Mandatory, Position = 0)]
+    [int]$PrNumber,
+
+    [Parameter(ParameterSetName = 'ByBuild', Mandatory, Position = 0)]
+    [long]$BuildId,
+
+    [string]$Org = 'dnceng-public',
+    [string]$Project = 'public',
+    [int]$Definition = 90,
+    [switch]$AllLegs,
+    [switch]$Json
+)
+
+$ErrorActionPreference = 'Stop'
+$api = "https://dev.azure.com/$Org/$Project/_apis/build"
+
+function Resolve-BuildId([int]$pr) {
+    $url = "$api/builds?definitions=$Definition&reasonFilter=pullRequest&statusFilter=completed&`$top=100&api-version=7.1"
+    $builds = (Invoke-RestMethod -Uri $url).value |
+        Where-Object { $_.triggerInfo.'pr.number' -eq "$pr" }
+    if (-not $builds) { throw "No completed PR builds found for PR #$pr (definition $Definition)." }
+    $failed = $builds | Where-Object { $_.result -eq 'failed' } | Sort-Object finishTime -Descending
+    $chosen = if ($failed) { $failed[0] } else { ($builds | Sort-Object finishTime -Descending)[0] }
+    Write-Host "PR #$pr -> build $($chosen.id) ($($chosen.result), finished $($chosen.finishTime))"
+    return $chosen.id
+}
+
+function Get-RepoRoot { (Resolve-Path (Join-Path $PSScriptRoot '..\..\..\..')).Path }
+
+function Resolve-LocalBinlogs([string[]]$paths, [bool]$all) {
+    $out = [System.Collections.Generic.List[string]]::new()
+    foreach ($p in $paths) {
+        if (Test-Path -LiteralPath $p -PathType Leaf) {
+            if ([IO.Path]::GetExtension($p) -eq '.binlog') { $out.Add((Resolve-Path -LiteralPath $p).Path) }
+            else { Write-Warning "Skipping non-binlog file: $p" }
+        }
+        elseif (Test-Path -LiteralPath $p -PathType Container) {
+            $found = Get-ChildItem -LiteralPath $p -Recurse -Filter *.binlog -ErrorAction SilentlyContinue |
+                Sort-Object LastWriteTime -Descending
+            if (-not $found) { throw "No *.binlog under directory: $p" }
+            if ($all) { $found | ForEach-Object { $out.Add($_.FullName) } } else { $out.Add($found[0].FullName) }
+        }
+        else {
+            $glob = Get-ChildItem -Path $p -ErrorAction SilentlyContinue | Where-Object { $_.Extension -eq '.binlog' }
+            if (-not $glob) { throw "Path not found or no .binlog match: $p" }
+            $glob | ForEach-Object { $out.Add($_.FullName) }
+        }
+    }
+    return $out
+}
+
+$binlogs = [System.Collections.Generic.List[string]]::new()
+
+switch ($PSCmdlet.ParameterSetName) {
+    'Local' {
+        $logDir = Join-Path (Get-RepoRoot) 'artifacts/log'
+        if (-not (Test-Path $logDir)) {
+            throw "No local build logs at '$logDir'. Build with a binary log first " +
+                "(e.g. ./build.sh --binaryLog or eng/Build.ps1 -binaryLog), or pass -BinlogPath."
+        }
+        Write-Host "Auto-discovering newest binlog under $logDir ..."
+        $binlogs = Resolve-LocalBinlogs @($logDir) $AllLegs.IsPresent
+    }
+    'ByPath' {
+        $binlogs = Resolve-LocalBinlogs $BinlogPath $AllLegs.IsPresent
+    }
+    default {
+        # Azure DevOps modes (ByPr / ByBuild): download the build-leg binlog
+        # artifact and KEEP it so the binlog-mcp MCP server can read it.
+        if ($PSCmdlet.ParameterSetName -eq 'ByPr') { $BuildId = Resolve-BuildId $PrNumber }
+        $artifacts = (Invoke-RestMethod -Uri "$api/builds/$BuildId/artifacts?api-version=7.1").value
+        $selected = if ($AllLegs) {
+            $artifacts | Where-Object { $_.name -match 'binlog' }
+        } else {
+            $build = $artifacts | Where-Object { $_.name -match 'build binlog' }
+            if ($build) { $build } else { $artifacts | Where-Object { $_.name -match 'binlog' } }
+        }
+        if (-not $selected) { throw "Build $BuildId has no binlog artifacts." }
+
+        $downloadDir = Join-Path ([IO.Path]::GetTempPath()) "binlog-analysis-$BuildId"
+        if (Test-Path $downloadDir) { Remove-Item -Recurse -Force $downloadDir }
+        New-Item -ItemType Directory -Force -Path $downloadDir | Out-Null
+        foreach ($a in $selected) {
+            $zip = Join-Path $downloadDir "$($a.name -replace '[^\w.-]', '_').zip"
+            Write-Host "Downloading artifact: $($a.name)"
+            Invoke-WebRequest -Uri $a.resource.downloadUrl -OutFile $zip
+            $dest = Join-Path $downloadDir ($a.name -replace '[^\w.-]', '_')
+            Expand-Archive -Path $zip -DestinationPath $dest -Force
+            Get-ChildItem $dest -Recurse -Filter *.binlog | ForEach-Object { $binlogs.Add($_.FullName) }
+        }
+        Write-Host "Kept under: $downloadDir"
+    }
+}
+
+if ($binlogs.Count -eq 0) { throw "No .binlog files resolved." }
+
+if ($Json) {
+    [pscustomobject]@{ binlogs = @($binlogs) } | ConvertTo-Json
+    return
+}
+
+Write-Host ''
+Write-Host "Resolved $($binlogs.Count) binlog(s):"
+foreach ($b in $binlogs) { Write-Host "  $b" }
+Write-Host ''
+Write-Host 'Next: analyze with the binlog-mcp MCP server (arg name is binlog_file):'
+Write-Host '  binlog_overview { binlog_file: "<path>" }   # build status + error/warning counts'
+Write-Host '  binlog_diagnose { binlog_file: "<path>" }   # categorized root causes + next steps'
+Write-Host '  binlog_errors   { binlog_file: "<path>" }   # structured errors (code/file/line/project)'
+Write-Host 'If the build succeeded / 0 errors (e.g. a test-only or formatting failure), there is nothing to analyze.'
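
The selection rule in `Resolve-BuildId` is to prefer the most recently finished failed build for the PR and otherwise fall back to the most recently finished build of any result. It can be restated as a small Python sketch that runs without network access. The build records are hypothetical stand-ins shaped like the fields the script reads (`id`, `result`, `finishTime`, `triggerInfo['pr.number']`), and `choose_build` is an illustrative name. The sketch assumes `finishTime` values are uniformly formatted ISO 8601 strings, so string comparison orders them chronologically.

```python
def choose_build(builds, pr_number):
    """Pick the newest failed build for a PR, else the newest build overall."""
    matching = [
        b for b in builds
        if b.get("triggerInfo", {}).get("pr.number") == str(pr_number)
    ]
    if not matching:
        raise LookupError(f"No completed PR builds found for PR #{pr_number}.")
    failed = [b for b in matching if b.get("result") == "failed"]
    pool = failed or matching  # the fallback mirrors the script's else-branch
    return max(pool, key=lambda b: b["finishTime"])


# Hypothetical sample data; ids and timestamps are made up for illustration.
SAMPLE_BUILDS = [
    {"id": 101, "result": "failed", "finishTime": "2025-01-01T10:00:00Z",
     "triggerInfo": {"pr.number": "123"}},
    {"id": 102, "result": "succeeded", "finishTime": "2025-01-02T10:00:00Z",
     "triggerInfo": {"pr.number": "123"}},
    {"id": 103, "result": "failed", "finishTime": "2025-01-01T12:00:00Z",
     "triggerInfo": {"pr.number": "123"}},
    {"id": 104, "result": "succeeded", "finishTime": "2025-01-03T10:00:00Z",
     "triggerInfo": {"pr.number": "456"}},
]

if __name__ == "__main__":
    print(choose_build(SAMPLE_BUILDS, 123)["id"])
    print(choose_build(SAMPLE_BUILDS, 456)["id"])
```

The fallback means a PR with no failed builds still resolves to a build. Its binlog may then contain no errors, which is the case where `binlog_overview` reports a successful build.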

.github/skills/hypothesis-driven-debugging/SKILL.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@ Use this skill when:
 - Troubleshooting performance regressions
 - Examining warning/error message issues
 
+> **Related:** for a build / compile / restore failure, run the `binlog-analysis` skill first — it fetches the build's MSBuild binary log and analyzes it live via the `binlog-mcp` MCP (structured errors + root-cause diagnosis), a fast way to scope the minimal reproduction below.
+
 ## Core Principles
 
 1. **Always start with a minimal reproduction**

.github/skills/pr-build-status/SKILL.md

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ compatibility: Requires GitHub CLI (gh) authenticated with access to dotnet/fsha
 
 Retrieve and systematically analyze Azure DevOps build failures for GitHub PRs.
 
+> **Related:** for build / compile / restore / WarnAsError failures, the `binlog-analysis` skill fetches the failed build's MSBuild binary log (local build or AzDo PR build) and analyzes it live via the `binlog-mcp` MCP — structured errors, root-cause diagnosis, and an MSBuild perf X-ray. Use it once you have the failed build or PR number.
+
 ## CRITICAL: Collect-First Workflow
 
 **DO NOT push fixes until ALL errors are collected and reproduced locally.**
