Promote dev to main (v5.x: ADO pipeline integration, AI guidance, notification-taxonomy refinement) #2936

Merged
michaelcfanning merged 9 commits into main from dev
May 26, 2026

@michaelcfanning
Member

Summary

Promote seven commits accumulated on dev since the last main sync (since the v4.6.4 cut). All under the v5.x release line; nothing else in flight. origin/dev is 7 ahead, 0 behind origin/main.

The work groups into three coherent stories. Per-PR detail is in each linked PR; ReleaseHistory.md UNRELEASED is the canonical artifact.

1. ADO pipeline integration & sample publishing

  • emit-init-run: auto-stamp ADO pipeline automationDetails from env + GHAzDO sample #2929: emit-init-run auto-stamps automationDetails.id and the four azuredevops/pipeline/build/* properties from the standard ADO pipeline env vars when TF_BUILD=True. Adds Microsoft.CodeAnalysis.Sarif.Multitool.AdoPipelineContext. Composes with --automation-guid / --automation-correlation-guid. Partial or malformed env state fails the verb before any file is written. Companion GHAzDO1019/1020 rules validate the resulting shape.
  • Add PublishSampleToGhazdo.ps1 + clone-aware CweGenerateSample.ps1 #2931: src/Sarif/Taxonomies/PublishSampleToGhazdo.ps1 POSTs gzipped SARIF to the GHAzDO SARIFs ingestion endpoint, parsing org/project/repo from versionControlProvenance[0].repositoryUri. CweGenerateSample.ps1 derives --vcp-repositoryuri and the --srcroot prefix from git remote get-url origin with a microsoft/sarif-sdk fallback; on the canonical clone the fixtures stay byte-identical.

2. AI generation guidance port

3. add-reporting-descriptor verb + notification-taxonomy refinement

  • Add multitool add-reporting-descriptor verb #2933: Adds multitool add-reporting-descriptor. Default target is tool.driver.notifications[]; --rules retargets to tool.driver.rules[] and gates the descriptor id against AIRuleIdConvention.IsNovel (only NOVEL-* ids may be self-registered as rules; taxonomy-mapped rules like CWE-89 come from the taxonomy enricher). Duplicate id is rejected on receipt. Adds SarifEventKinds.RuleDescriptor and SarifEventKinds.NotificationDescriptor; the replayer merges producer-supplied descriptors into the target list before result-driven auto-registration so an explicit NOVEL- descriptor pre-empts the minimal one that would otherwise be synthesized.
  • Strip editorial prefixes from AI notification taxonomy #2934: BRK. Strips editorial prefixes from the AI notification taxonomy. Ids now name the concern only: DECISION, RULED-OUT, DATA-ACCESS-DENIED, TOOL-UNAVAILABLE, etc. The array (toolExecutionNotifications vs toolConfigurationNotifications) encodes the kind; tool.driver.name encodes the emitter. Same id MAY now legally appear in both arrays. Routing moves to the verb (add-notification --config / -c); event-log kind notification splits into execution-notification / configuration-notification; the replayer routes each to the matching invocation array. AI1014.ExecutionNotificationPlacement removed (its sole purpose was enforcing the now-dropped prefix-vs-array consistency).
  • Generalize ALAS-SIGNAL notification id to LEARNING-SIGNAL #2935: BRK. Generalizes ALAS-SIGNAL to LEARNING-SIGNAL. ALAS named a specific consumer; the new id names the concern. AI2018.ProvideExecutionSignalArtifact renames to AI2018.ProvideLearningSignalArtifact for the same reason (the "Execution" qualifier was redundant under the new convention, and downstream learning systems aren't only reading the execution-side array). Closes out the prefix-strip work end-to-end.

Verification

  • All seven PRs merged independently into dev after passing their own build + test gates at merge time.
  • Last verification on dev HEAD (68f147fa): dotnet build src/Sarif.Sdk.sln -c Release → 0 warnings, 0 errors. dotnet test Test.UnitTests.Sarif.Multitool.Library → 196 passed, 1 skipped (pre-existing), 0 failed.
  • origin/main has zero commits not on dev — clean fast-forwardable promotion.

Breaking change disclosure

Two BRK bullets (both under UNRELEASED, both in #2934/#2935). Beyond the HashAlgorithms consolidation already on dev (and prior Ado → GHAzDO rename), v5.x adds:

AI-rule adoption is low; v5.x is the place to take refinement over back-compat.

Merge strategy

This is a dev → main promotion of seven already-squashed feature PRs. Use a merge commit (gh pr merge --merge), not a squash, so each feature PR's narrative and (#NNNN) suffix survive on main. Do not delete the dev branch.

michaelcfanning and others added 7 commits May 25, 2026 15:54
…HAzDO sample (#2929)

* emit-init-run: auto-stamp ADO pipeline automationDetails from env + GHAzDO sample

Adds AdoPipelineContext, which detects an Azure DevOps pipeline
execution context from the standard predefined environment variables
and stamps run.automationDetails so producers that run inside ADO
pipelines automatically satisfy GHAzDO1019 and GHAzDO1020 with no
additional CLI flags.

- TryDetect is three-state (None / Partial / Complete). Partial fails
  loudly with a per-variable diagnostic before any file-system side
  effects so a misconfigured pipeline never emits a half-stamped SARIF.
- ApplyTo writes the canonical
  azuredevops/pipeline/build/<org>/<projectId>/<buildDefId>/<phaseId>/<branchRef>/<buildId>
  id and the four azuredevops/pipeline/build/* property keys ADO
  Advanced Security ingestion validates.
- Composes with the existing --automation-guid / --automation-correlation-guid
  flags; never overwrites a producer-supplied guid/correlationGuid.
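
The three-state detection and the id layout described above can be sketched roughly as follows. This is a Python illustration, not the SDK's C# implementation. Only TF_BUILD, BUILD_DEFINITIONID, and SYSTEM_PHASEID are named in this PR; the other variable names are standard ADO predefined variables picked here as assumptions, and `try_detect` / `automation_id` are hypothetical names.

```python
from enum import Enum

# Assumed set of required variables; the real AdoPipelineContext may read
# a different list.
REQUIRED = (
    "SYSTEM_COLLECTIONURI",
    "SYSTEM_TEAMPROJECTID",
    "BUILD_DEFINITIONID",
    "SYSTEM_PHASEID",
    "BUILD_SOURCEBRANCH",
    "BUILD_BUILDID",
)


class Detection(Enum):
    NONE = "none"
    PARTIAL = "partial"
    COMPLETE = "complete"


def try_detect(env):
    """Classify the environment; performs no file-system side effects."""
    if env.get("TF_BUILD", "").lower() != "true":
        return Detection.NONE, []
    # One diagnostic per missing variable, so a caller can fail loudly.
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if problems:
        return Detection.PARTIAL, problems
    return Detection.COMPLETE, []


def automation_id(env):
    """Build the id in the <org>/<projectId>/.../<buildId> order quoted above."""
    # Assumption: the org is the last path segment of the collection URI.
    org = env["SYSTEM_COLLECTIONURI"].rstrip("/").rsplit("/", 1)[-1]
    parts = [
        org,
        env["SYSTEM_TEAMPROJECTID"],
        env["BUILD_DEFINITIONID"],
        env["SYSTEM_PHASEID"],
        env["BUILD_SOURCEBRANCH"],  # no escaping applied in this sketch
        env["BUILD_BUILDID"],
    ]
    return "azuredevops/pipeline/build/" + "/".join(parts)
```

A caller would check the returned state first and write any file only on `Detection.NONE` or `Detection.COMPLETE`, which mirrors the side-effects-after-detection ordering.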

CweGenerateSample.ps1 grows a -GHAzDO switch that produces the new
CweGHAzDoSample.sarif fixture alongside the existing CweSample.sarif.
The script populates the ADO env vars for the duration of emit-init-run
so AdoPipelineContext stamps automationDetails, then patches
tool.driver.fullName post-finalize so GHAzDO1018 passes. Default-mode
runs explicitly clear those same env vars so a developer shell with
TF_BUILD=True can never drift the AI-shape fixture.

CweGHAzDoSample.sarif validates with zero errors, zero warnings, and
zero notes under --rule-kind Sarif;AI;GHAzDO. CweGeneratedSampleTests
covers both fixtures with byte-identical regression gates as separate
[Fact]s sharing one private helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trim ReleaseHistory bullets + add copilot-instructions.md

The two bullets I just added for env-driven ADO stamping and the
GHAzDO sample fixture were PR-description-sized, not release-note-sized.
Trimmed both to match the style of their neighbors (single self-contained
sentence + concrete names + minimal facts a downstream consumer needs).
The full narrative — three-state detection prose, env-var precedence
table, composition guarantees — already lives on PR #2929 where it
belongs.

Adds .github/copilot-instructions.md so future agents in this repo see
the release-notes-vs-PR-description distinction up front, plus the
house idioms that come up repeatedly in code review (no [Theory],
GHAzDO casing, AI ruleId convention, sample-fixture convention,
side-effects-after-detection, internals-via-InternalsVisibleTo).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Port SARIF AI generation guidance from ai-plugins to sarif-sdk (#2930)

Make sarif-sdk the single source of truth for the SARIF spec markdown,
the AI-generated-findings profile, and the agent skills that emit and
validate AI SARIF.

Adds:
- docs/spec/sarif-v2.1.0-spec.md
  Convenience markdown rendering of the OASIS SARIF 2.1.0 specification
  (Plus Errata 01). The OASIS-published document is canonical; IPR notice
  preserved at top of file.
- docs/ai/generating-sarif.md
  Normative guidance for representing AI/LLM-produced security findings
  as first-class SARIF: ai/origin declaration, tool identity, result
  structure, exploitability and attacker-position vocabulary, evidence
  model, redaction, notification taxonomy (AI/EXEC/*, AI/CFG/*), and
  the full AI rule-pack appendix. Includes a Mermaid object-model
  diagram in the appendix.
- docs/ai/example.sarif
  Comprehensive reference SARIF log that conforms to the AI profile.
  Passes `dotnet sarif validate --rule-kind 'Sarif;AI'` cleanly.
- skills/emit-sarif-findings/SKILL.md
  Agent-operating procedure for emitting AI SARIF using the
  Sarif.Multitool emit verbs (emit-init-run, add-result,
  add-notification, emit-finalize --validate). Multitool-only;
  cross-references docs/ai/generating-sarif.md as the normative source.
- skills/validate-sarif-findings/SKILL.md
  Agent-operating procedure for validating AI SARIF. Uses
  `--rule-kind 'Sarif;AI'` against the multitool's AI rule pack
  (AI1003-AI2019) plus the standard SARIF rules in one pass.

Updates:
- README.md adds a short pointer section to the new spec, guidance,
  and skills directories.
- docs/multitool-usage.md gains a 'Modes' table entry for each of the
  new emit verbs (emit-init-run, add-result, add-notification,
  emit-finalize) plus a worked example.

Verification gates run before commit:
- `dotnet sarif validate docs/ai/example.sarif --rule-kind 'Sarif;AI'`
  reports 0 errors.
- End-to-end smoke test (init -> add-result -> finalize --validate)
  produces a SARIF file with 1 result, 1 rule (CWE-78 enriched from
  the embedded MITRE CWE taxonomy).
- All skill command snippets match actual --help output for the
  relevant verb at Sarif.Multitool 5.0.0.

Companion work (separate PR in microsoft/ai-plugins):
- Delete plugins/sarif/ entirely; the canonical home is now this
  repository.
- Retool Swallowtail (and other AI-detector plugins in ai-plugins)
  to invoke Sarif.Multitool emit verbs directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add PublishSampleToGhazdo.ps1 + clone-aware CweGenerateSample.ps1 (#2931)

`CweGenerateSample.ps1` now derives `--vcp-repositoryuri` and the
`emit-finalize --srcroot` prefix from `git -C $repoRoot remote get-url
origin`, falling back to `https://github.com/microsoft/sarif-sdk` when
origin is unset. On the canonical microsoft/sarif-sdk clone the generated
fixtures (CweSample.sarif, CweGHAzDoSample.sarif) are byte-identical to
the previous hardcoded form. GitHub origins get a `<repo>/blob/main/`
SRCROOT prefix; other hosts (including ADO) get the bare repo URL with
a trailing slash.
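
The prefix rule in the paragraph above can be sketched as a small function. This is a Python illustration of the described behavior, not the PowerShell in CweGenerateSample.ps1; `srcroot_prefix` is a hypothetical name, and stripping a trailing `.git` is an assumption the commit message does not state.

```python
from urllib.parse import urlparse

# Fallback named in the commit message for clones with no origin remote.
FALLBACK = "https://github.com/microsoft/sarif-sdk"


def srcroot_prefix(origin_url):
    """Map an origin URL (or None when unset) to a SRCROOT prefix."""
    repo = (origin_url or FALLBACK).strip()
    if repo.endswith(".git"):  # assumption: normalize clone-style URLs
        repo = repo[:-4]
    repo = repo.rstrip("/")
    host = (urlparse(repo).hostname or "").lower()
    if host == "github.com":
        # GitHub origins point at the blob view of the main branch.
        return f"{repo}/blob/main/"
    # Other hosts (including ADO) get the bare repo URL plus a slash.
    return f"{repo}/"
```

SSH-style remotes (for example `git@github.com:owner/repo`) have no parseable hostname here and fall into the "other hosts" branch; the real script may handle them differently.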

Adds `src/Sarif/Taxonomies/PublishSampleToGhazdo.ps1` -- POSTs a gzipped
SARIF to the GHAzDO SARIFs ingestion endpoint
(`/{org}/{project}/_apis/alert/repositories/{repo}/sarifs?api-version=
7.2-preview.1` on advsec.dev.azure.com, fallback dev.azure.com). Target
org/project/repo are parsed from runs[0].versionControlProvenance[0]
.repositoryUri; PAT is read from the ADO_PAT environment variable.
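
As a rough sketch of the target resolution described above, the following Python builds the ingestion URL and the gzipped body without sending anything. It assumes the repository URI has the ADO shape `https://dev.azure.com/{org}/{project}/_git/{repo}`; `ingestion_url` and `gzip_sarif` are hypothetical helper names, and authentication with the ADO_PAT value is deliberately left out because the PR does not describe it.

```python
import gzip
import json
from urllib.parse import urlparse


def ingestion_url(repository_uri, host="advsec.dev.azure.com"):
    """Derive the SARIF ingestion URL from a versionControlProvenance URI."""
    segments = [s for s in urlparse(repository_uri).path.split("/") if s]
    # Assumed URI layout: /{org}/{project}/_git/{repo}
    if len(segments) != 4 or segments[2] != "_git":
        raise ValueError(f"unrecognized ADO repository URI: {repository_uri}")
    org, project, _, repo = segments
    return (
        f"https://{host}/{org}/{project}/_apis/alert/repositories/"
        f"{repo}/sarifs?api-version=7.2-preview.1"
    )


def gzip_sarif(log):
    """Serialize a SARIF log (as a dict) and gzip it for the POST body."""
    return gzip.compress(json.dumps(log).encode("utf-8"))
```

Passing `host="dev.azure.com"` would model the fallback host mentioned in the commit message.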

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Scrub Microsoft-internal references from AI guidance port (#2932)

Public-OSS hygiene pass on the SARIF AI guidance and skills.
Descriptor ids that are already shipped in the SDK (AI/EXEC/ALAS-SIGNAL
in AI2018.ProvideExecutionSignalArtifact and AI1014's AI/EXEC/* and
AI/CFG/* prefixes) are kept as-is so the docs match the current SDK
implementation.

Changes:
- Drop ALAS expansion and neutralize the signal-payload schema
  (descriptor id kept; no payload schema was ever enforced by the SDK).
- Replace ProjectApi with FastAPI (five sites) in API-handler examples.
- Replace 'Geneva cluster' with 'telemetry cluster' in a deployment
  example.
- Replace example rule id SWT-CPP-001 with ACME-CPP-001.
- Replace author: mikefan with sarif-sdk-maintainers in both skill
  frontmatters.
- Soften a reference to an unpublished companion remediation guidance
  document.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add multitool add-reporting-descriptor verb

Appends a fully-formed SARIF reportingDescriptor JSON object — supplied
via --input <path> or stdin — to the staged event log produced by
emit-init-run.

Two targets:
* Default → run.tool.driver.notifications[]. AI producers routinely emit
  notification descriptors (progress, telemetry, config errors). No id
  convention is enforced; notifications use opaque ids.
* --rules → run.tool.driver.rules[]. Gated against
  AIRuleIdConvention.IsNovel so only NOVEL- novel-finding descriptors
  are accepted. Taxonomy-mapped rule descriptors (e.g., CWE-89) come
  from the taxonomy enricher at finalize time, not from this verb.

Each descriptor id may appear at most once per event log. The verb scans
the existing event log on receipt and rejects duplicates against either
a prior add-reporting-descriptor event of the same target OR a
descriptor pre-populated on the run-header. A --force escape hatch is
acknowledged in error text but intentionally out of v1 scope.
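
The receipt-time checks described above (id validation, the NOVEL gate on the --rules path, and duplicate rejection against both prior events and the run header) might look like this in outline. This is a Python sketch with an invented in-memory event shape. `is_novel` stands in for AIRuleIdConvention.IsNovel and is assumed to be a simple prefix check, and duplicates are checked per target only.

```python
def is_novel(descriptor_id):
    # Stand-in for AIRuleIdConvention.IsNovel (assumed prefix check).
    return descriptor_id.startswith("NOVEL-")


def add_reporting_descriptor(event_log, header, descriptor, rules=False):
    """Validate a descriptor and append it to the staged event log."""
    descriptor_id = descriptor.get("id")
    if not isinstance(descriptor_id, str) or not descriptor_id:
        raise ValueError("descriptor id must be a non-empty string")
    if rules and not is_novel(descriptor_id):
        raise ValueError(f"'{descriptor_id}' is not a NOVEL- id")

    kind = "rule-descriptor" if rules else "notification-descriptor"
    target = "rules" if rules else "notifications"

    # Ids already taken: header pre-populated descriptors plus prior events
    # of the same target.
    seen = {d["id"] for d in header.get(target, [])}
    seen |= {e["descriptor"]["id"] for e in event_log if e["kind"] == kind}
    if descriptor_id in seen:
        raise ValueError(f"duplicate descriptor id '{descriptor_id}'")

    event_log.append({"kind": kind, "descriptor": descriptor})
```

Every check runs before the append, so a rejected descriptor leaves the event log unchanged.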

Event-log plumbing:
* Adds SarifEventKinds.RuleDescriptor ("rule-descriptor") and
  SarifEventKinds.NotificationDescriptor ("notification-descriptor"),
  threaded through SarifEventLogReader's kind allow-list.
* SarifEventReplayer buffers descriptor events and merges them into the
  target list BEFORE RegisterDescriptorsFromResults runs. This ordering
  matters: auto-registration synthesizes minimal descriptors only for
  ruleIds that aren't already represented, so an explicit NOVEL-
  descriptor pre-empts the minimal one. Header pre-populated descriptors
  are preserved by reference; the verb's emit-time dedup blocks
  id collisions between header and events.
* New event kinds are additive within CurrentSchemaVersion = 1; older
  readers will skip unknown kinds harmlessly, matching the forward-
  compat shape used when Notification / Invocation kinds were added.
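
The merge-before-auto-registration ordering can be illustrated with a toy replayer. This is a Python sketch under an assumed event shape; `replay_rules` is a hypothetical name, and the minimal descriptor it synthesizes (id only) is a placeholder for whatever RegisterDescriptorsFromResults actually produces.

```python
def replay_rules(header_rules, events, results):
    """Return the final rules list for a replayed run."""
    # 1. Header pre-populated descriptors are kept as the same objects.
    rules = list(header_rules)
    # 2. Explicit descriptor events are merged first.
    rules.extend(e["descriptor"] for e in events if e["kind"] == "rule-descriptor")
    # 3. Only then are minimal descriptors synthesized for rule ids that
    #    no explicit or header descriptor already represents.
    known = {r["id"] for r in rules}
    for result in results:
        rule_id = result["ruleId"]
        if rule_id not in known:
            rules.append({"id": rule_id})
            known.add(rule_id)
    return rules
```

If step 3 ran before step 2, the NOVEL- id would already be "known" through a minimal stub, and the richer producer-supplied descriptor would collide with it.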

Tests:
* 16 [Fact] tests on AddReportingDescriptorCommand covering both happy
  paths (notifications default, --rules), id validation (missing/empty/
  non-string), the NOVEL gate (taxonomy id rejection on --rules path
  only), rich payload round-trip (messageStrings, defaultConfiguration,
  helpUri, properties — including a date-shaped property string to guard
  against Json.NET DateTime coercion), duplicate detection within and
  across targets, duplicate detection against header-pre-populated
  descriptors for both target arrays, missing-wip-file path, and two
  malformed-input cases (bad JSON, non-object root).
* 3 [Fact] tests on SarifEventReplayer covering: rule-descriptor events
  populating rules and pre-empting auto-registration, notification-
  descriptor events populating notifications, and the
  header-pre-populated + events merge semantics.

No [Theory]/[InlineData] — repeated scenarios use shared private
helpers (SeedRunHeader) per house style.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Strip editorial prefixes from AI notification taxonomy (#2934)

Notification descriptor ids now name the concern only — `DECISION`,
`RULED-OUT`, `DATA-ACCESS-DENIED`, `ALAS-SIGNAL`, `TOOL-UNAVAILABLE`,
etc. The previous `AI/EXEC/*` and `AI/CFG/*` prefixes repeated context
the surrounding SARIF already carries: the array
(`toolExecutionNotifications` vs `toolConfigurationNotifications`)
encodes the kind, and `tool.driver.name` encodes the emitter. The
same id MAY now legally appear in both arrays. Suffixing `EXEC` or
`CFG` on every id is like suffixing `Class` on every C# class — the
surrounding context already says what kind of thing it is.

Placement is selected at authoring time: `add-notification` defaults
to `toolExecutionNotifications`; `add-notification --config` (`-c`)
routes to `toolConfigurationNotifications`. The event-log kind
`SarifEventKinds.Notification` splits into `ExecutionNotification`
(`"execution-notification"`) and `ConfigurationNotification`
(`"configuration-notification"`); the replayer routes each to the
matching invocation array.
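
A minimal Python sketch of that routing, using the two wire values quoted above. The event shape and the `event_kind` / `route` helper names are hypothetical; only the kind strings and the array names come from this PR.

```python
KIND_TO_ARRAY = {
    "execution-notification": "toolExecutionNotifications",
    "configuration-notification": "toolConfigurationNotifications",
}


def event_kind(config=False):
    """Pick the event kind at authoring time (config mirrors --config / -c)."""
    return "configuration-notification" if config else "execution-notification"


def route(invocation, event):
    """Append the event's notification to the matching invocation array."""
    array = KIND_TO_ARRAY[event["kind"]]
    invocation.setdefault(array, []).append(event["notification"])
```

Because placement comes from the event kind rather than the id, the same descriptor id can be routed to both arrays without conflict.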

`AI1014.ExecutionNotificationPlacement` is deleted. Its sole purpose
was enforcing prefix-vs-array consistency, which is structurally
meaningless under the new convention (the array IS the kind).
`AI2018` retains its semantic; the literal id it checks changes from
`AI/EXEC/ALAS-SIGNAL` to `ALAS-SIGNAL`.

BRK by the letter of v4.6.3 (AI1014 was added there), but AI rules
adoption is low and v5.x is the right place for refinement over
back-compat.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Generalize ALAS-SIGNAL notification id to LEARNING-SIGNAL (#2935)

ALAS named a specific consumer (an internal learning system). Under
the convention shipped in #2934, notification ids name the concern,
not the consumer. LEARNING-SIGNAL describes what the signal is,
independent of who reads it.

While here, rename the AI2018 rule from ProvideExecutionSignalArtifact
to ProvideLearningSignalArtifact for consistency: the class checks
the LEARNING-SIGNAL id, the "Execution" qualifier was redundant under
the new convention (placement is encoded by the array, not the id),
and downstream learning systems aren't necessarily reading only the
execution-side array.

Affects: AI2018 rule class + file + RuleId const + 3 resource
keys/messages, the AI2018 row in docs and skills tables, and the
UNRELEASED BRK bullet for #2934 (whose own "ALAS-SIGNAL example"
becomes "LEARNING-SIGNAL", and which now documents the
class-and-id rename together).

BRK on the just-merged BRK (both still UNRELEASED) — favored over
shipping the consumer name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Stamp ReleaseHistory UNRELEASED section as v5.0.0 (#2937)

build.props was already bumped to <VersionPrefix>5.0.0</VersionPrefix>
in #2924 (the SHA-1 BRK). This finishes the v5.0.0 cut by replacing
the UNRELEASED placeholder header in ReleaseHistory.md with the
canonical version banner (Sdk / Driver / Converters / Multitool /
Multitool Library nuget links), matching the v4.6.4 format.

Picked up by #2936 (the dev to main promotion PR).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…2938)

Per release-notes house style, bullets are one or two self-contained
sentences; PR-description prose belongs in the PR. The original bullet
was ~3x the length of its neighbors and re-litigated the motivation.

Split into two tighter bullets:

  1. Convention change + routing mechanism (id-prefix strip, new
     --config switch, event-kind split).
  2. Rule-table changes (AI1014 removal, AI2018 rename).

Drops the "prefixes were redundant because..." explanation, the wire
value parentheticals (`"execution-notification"` etc.), and the
"ALAS named a specific consumer" parenthetical. The change itself
is visible in the renames; the reader doesn't need the rationale.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@michaelcfanning michaelcfanning merged commit 4ef765d into main May 26, 2026
6 checks passed
michaelcfanning added a commit that referenced this pull request May 26, 2026
)

* emit-init-run: auto-stamp ADO pipeline automationDetails from env + GHAzDO sample (#2929)

* Port SARIF AI generation guidance from ai-plugins to sarif-sdk (#2930)

* Add PublishSampleToGhazdo.ps1 + clone-aware CweGenerateSample.ps1 (#2931)

* Scrub Microsoft-internal references from AI guidance port (#2932)

* Add multitool add-reporting-descriptor verb

* Strip editorial prefixes from AI notification taxonomy (#2934)

* Generalize ALAS-SIGNAL notification id to LEARNING-SIGNAL (#2935)

* Stamp ReleaseHistory UNRELEASED section as v5.0.0 (#2937)

* Trim and split the over-descriptive v5.0.0 notification-taxonomy BRK (#2938)

Per release-notes house style, bullets are one or two self-contained
sentences; PR-description prose belongs in the PR. The original bullet
was ~3x the length of its neighbors and re-litigated the motivation.

Split into two tighter bullets:

  1. Convention change + routing mechanism (id-prefix strip, new
     --config switch, event-kind split).
  2. Rule-table changes (AI1014 removal, AI2018 rename).

Drops the "prefixes were redundant because..." explanation, the wire
value parentheticals (`"execution-notification"` etc.), and the
"ALAS named a specific consumer" parenthetical. The change itself
is visible in the renames; the reader doesn't need the rationale.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix CweGenerateSample.ps1 -GHAzDO crash inside real ADO pipelines (#2940)

The mseng microsoft.sarif-sdk pipeline broke on the first build of main
after the v5.x promotion (build 31555367). Symptom:

  CweGenerateSample.ps1 (args: -Configuration Release -GHAzDO) exited with code 1.
  ADO pipeline context is partially configured. Either populate every
  required variable or clear them all.
  Problems:
    BUILD_DEFINITIONID='1234' disagrees with SYSTEM_DEFINITIONID='9978'
    (both name the same pipeline identifier and must match)

Root cause: the deterministic-fixture env override in CweGenerateSample.ps1
stamps BUILD_DEFINITIONID=1234 for byte-stable output, but does not also
override SYSTEM_DEFINITIONID. ADO agents inject both. The verb's
must-match cross-check in AdoPipelineContext.TryDetect (correctly) refuses
to proceed when the two disagree.

Fix the script (not the verb): add SYSTEM_DEFINITIONID alongside
BUILD_DEFINITIONID in the ordered hashtable, plus
SYSTEM_JOBID / SYSTEM_JOBNAME alongside SYSTEM_PHASEID / SYSTEM_PHASENAME
for symmetric hygiene (those pairs are exempt from must-match, but the
default-mode cleanup loop benefits from covering the agent's full
fallback set). The fixture SARIF
bytes do not change — the primary env vars were already set and are the
ones the verb actually reads.

Regression gate: new
  CweGHAzDoSample_RegenerationSucceeds_WhenAmbientAdoFallbackEnvVarsConflict
[Fact] explicitly seeds SYSTEM_DEFINITIONID / SYSTEM_JOBID / SYSTEM_JOBNAME
with values that disagree with the script's deterministic primaries
before invoking the script. Without the script fix it fails the same way
the mseng build did; with the fix it passes byte-identical.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
michaelcfanning added a commit that referenced this pull request May 26, 2026
* emit-init-run: auto-stamp ADO pipeline automationDetails from env + GHAzDO sample (#2929)

* emit-init-run: auto-stamp ADO pipeline automationDetails from env + GHAzDO sample

Adds AdoPipelineContext, which detects an Azure DevOps pipeline
execution context from the standard predefined environment variables
and stamps run.automationDetails so producers that run inside ADO
pipelines automatically satisfy GHAzDO1019 and GHAzDO1020 with no
additional CLI flags.

- TryDetect is three-state (None / Partial / Complete). Partial fails
  loudly with a per-variable diagnostic before any file-system side
  effects so a misconfigured pipeline never emits a half-stamped SARIF.
- ApplyTo writes the canonical
  azuredevops/pipeline/build/<org>/<projectId>/<buildDefId>/<phaseId>/<branchRef>/<buildId>
  id and the four azuredevops/pipeline/build/* property keys ADO
  Advanced Security ingestion validates.
- Composes with the existing --automation-guid / --automation-correlation-guid
  flags; never overwrites a producer-supplied guid/correlationGuid.
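
The three-state detection described above can be sketched roughly as
follows. This is an illustration in Python, not the SDK's C# code: the
REQUIRED_VARS list and the try_detect name are assumptions, since the
commit does not enumerate the exact variables the verb reads.

```python
from enum import Enum

# Hypothetical required set: an assumption for illustration only.
REQUIRED_VARS = (
    "SYSTEM_COLLECTIONURI",
    "SYSTEM_TEAMPROJECTID",
    "BUILD_DEFINITIONID",
    "SYSTEM_PHASEID",
    "BUILD_SOURCEBRANCH",
    "BUILD_BUILDID",
)


class Detection(Enum):
    NONE = "none"          # not running in a pipeline
    PARTIAL = "partial"    # pipeline detected, but some variables missing
    COMPLETE = "complete"  # everything needed to stamp automationDetails


def try_detect(env):
    """Classify an environment mapping and collect per-variable problems."""
    if env.get("TF_BUILD", "").lower() != "true":
        return Detection.NONE, []
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    if problems:
        return Detection.PARTIAL, problems
    return Detection.COMPLETE, []
```

A caller would check the state before touching the file system, so a
Partial result can fail with the collected problems and leave no
half-written output behind.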

CweGenerateSample.ps1 grows a -GHAzDO switch that produces the new
CweGHAzDoSample.sarif fixture alongside the existing CweSample.sarif.
The script populates the ADO env vars for the duration of emit-init-run
so AdoPipelineContext stamps automationDetails, then patches
tool.driver.fullName post-finalize so GHAzDO1018 passes. Default-mode
runs explicitly clear those same env vars so a developer shell with
TF_BUILD=True can never drift the AI-shape fixture.

CweGHAzDoSample.sarif validates with zero errors, zero warnings, and
zero notes under --rule-kind Sarif;AI;GHAzDO. CweGeneratedSampleTests
covers both fixtures with byte-identical regression gates as separate
[Fact]s sharing one private helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trim ReleaseHistory bullets + add copilot-instructions.md

The two bullets I just added for env-driven ADO stamping and the
GHAzDO sample fixture were PR-description-sized, not release-note-sized.
Trimmed both to match the style of their neighbors (single self-contained
sentence + concrete names + minimal facts a downstream consumer needs).
The full narrative — three-state detection prose, env-var precedence
table, composition guarantees — already lives on PR #2929 where it
belongs.

Adds .github/copilot-instructions.md so future agents in this repo see
the release-notes-vs-PR-description distinction up front, plus the
house idioms that come up repeatedly in code review (no [Theory],
GHAzDO casing, AI ruleId convention, sample-fixture convention,
side-effects-after-detection, internals-via-InternalsVisibleTo).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Port SARIF AI generation guidance from ai-plugins to sarif-sdk (#2930)

Make sarif-sdk the single source of truth for the SARIF spec markdown,
the AI-generated-findings profile, and the agent skills that emit and
validate AI SARIF.

Adds:
- docs/spec/sarif-v2.1.0-spec.md
  Convenience markdown rendering of the OASIS SARIF 2.1.0 specification
  (Plus Errata 01). The OASIS-published document is canonical; IPR notice
  preserved at top of file.
- docs/ai/generating-sarif.md
  Normative guidance for representing AI/LLM-produced security findings
  as first-class SARIF: ai/origin declaration, tool identity, result
  structure, exploitability and attacker-position vocabulary, evidence
  model, redaction, notification taxonomy (AI/EXEC/*, AI/CFG/*), and
  the full AI rule-pack appendix. Includes a Mermaid object-model
  diagram in the appendix.
- docs/ai/example.sarif
  Comprehensive reference SARIF log that conforms to the AI profile.
  Passes `dotnet sarif validate --rule-kind 'Sarif;AI'` cleanly.
- skills/emit-sarif-findings/SKILL.md
  Agent-operating procedure for emitting AI SARIF using the
  Sarif.Multitool emit verbs (emit-init-run, add-result,
  add-notification, emit-finalize --validate). Multitool-only;
  cross-references docs/ai/generating-sarif.md as the normative source.
- skills/validate-sarif-findings/SKILL.md
  Agent-operating procedure for validating AI SARIF. Uses
  `--rule-kind 'Sarif;AI'` against the multitool's AI rule pack
  (AI1003-AI2019) plus the standard SARIF rules in one pass.

Updates:
- README.md adds a short pointer section to the new spec, guidance,
  and skills directories.
- docs/multitool-usage.md gains a 'Modes' table entry for each of the
  new emit verbs (emit-init-run, add-result, add-notification,
  emit-finalize) plus a worked example.

Verification gates run before commit:
- `dotnet sarif validate docs/ai/example.sarif --rule-kind 'Sarif;AI'`
  reports 0 errors.
- End-to-end smoke test (init -> add-result -> finalize --validate)
  produces a SARIF file with 1 result, 1 rule (CWE-78 enriched from
  the embedded MITRE CWE taxonomy).
- All skill command snippets match actual --help output for the
  relevant verb at Sarif.Multitool 5.0.0.

Companion work (separate PR in microsoft/ai-plugins):
- Delete plugins/sarif/ entirely; the canonical home is now this
  repository.
- Retool Swallowtail (and other AI-detector plugins in ai-plugins)
  to invoke Sarif.Multitool emit verbs directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add PublishSampleToGhazdo.ps1 + clone-aware CweGenerateSample.ps1 (#2931)

`CweGenerateSample.ps1` now derives `--vcp-repositoryuri` and the
`emit-finalize --srcroot` prefix from `git -C $repoRoot remote get-url
origin`, falling back to `https://github.com/microsoft/sarif-sdk` when
origin is unset. On the canonical microsoft/sarif-sdk clone the generated
fixtures (CweSample.sarif, CweGHAzDoSample.sarif) are byte-identical to
the previous hardcoded form. GitHub origins get a `<repo>/blob/main/`
SRCROOT prefix; other hosts (including ADO) get the bare repo URL with
a trailing slash.
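
A minimal Python sketch of the origin-to-SRCROOT rule described above
(the script itself is PowerShell). The function name is hypothetical,
and stripping a trailing `.git` is an assumption the commit message
does not confirm.

```python
from urllib.parse import urlparse

FALLBACK_ORIGIN = "https://github.com/microsoft/sarif-sdk"


def derive_repo_and_srcroot(origin):
    """Return (repository_uri, srcroot_prefix) for a git origin URL."""
    repo = (origin or "").strip() or FALLBACK_ORIGIN
    repo = repo.rstrip("/")
    if repo.endswith(".git"):  # assumption: normalize a trailing .git
        repo = repo[:-4]
    host = (urlparse(repo).hostname or "").lower()
    if host == "github.com":
        # GitHub origins point SRCROOT at the main-branch blob view.
        return repo, repo + "/blob/main/"
    # Other hosts (including ADO) get the bare repo URL plus a slash.
    return repo, repo + "/"
```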

Adds `src/Sarif/Taxonomies/PublishSampleToGhazdo.ps1` -- POSTs a gzipped
SARIF to the GHAzDO SARIFs ingestion endpoint
(`/{org}/{project}/_apis/alert/repositories/{repo}/sarifs?api-version=
7.2-preview.1` on advsec.dev.azure.com, fallback dev.azure.com). Target
org/project/repo are parsed from runs[0].versionControlProvenance[0]
.repositoryUri; PAT is read from the ADO_PAT environment variable.
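
The target parsing and payload preparation can be sketched in Python
as below. The sketch assumes the repositoryUri has the layout
`https://dev.azure.com/{org}/{project}/_git/{repo}`; the function names
are hypothetical, and the actual POST (with the PAT) is omitted.

```python
import gzip
import json
from urllib.parse import urlparse

API_VERSION = "7.2-preview.1"


def repository_uri_of(log):
    """Read runs[0].versionControlProvenance[0].repositoryUri from a log."""
    return log["runs"][0]["versionControlProvenance"][0]["repositoryUri"]


def ingestion_url(repository_uri, host="advsec.dev.azure.com"):
    """Build the SARIFs ingestion URL from an ADO repository URI."""
    parts = [p for p in urlparse(repository_uri).path.split("/") if p]
    if len(parts) != 4 or parts[2] != "_git":
        raise ValueError(f"not an ADO repository URI: {repository_uri}")
    org, project, _, repo = parts
    return (
        f"https://{host}/{org}/{project}/_apis/alert/repositories/"
        f"{repo}/sarifs?api-version={API_VERSION}"
    )


def gzip_sarif(log):
    """Serialize a SARIF log to compact JSON and gzip the bytes."""
    return gzip.compress(json.dumps(log, separators=(",", ":")).encode("utf-8"))
```

Passing `host="dev.azure.com"` yields the fallback endpoint mentioned
above.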

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Scrub Microsoft-internal references from AI guidance port (#2932)

Public-OSS hygiene pass on the SARIF AI guidance and skills.
Descriptor ids that are already shipped in the SDK (AI/EXEC/ALAS-SIGNAL
in AI2018.ProvideExecutionSignalArtifact and AI1014's AI/EXEC/* and
AI/CFG/* prefixes) are kept as-is so the docs match the current SDK
implementation.

Changes:
- Drop ALAS expansion and neutralize the signal-payload schema
  (descriptor id kept; no payload schema was ever enforced by the SDK).
- Replace ProjectApi with FastAPI (five sites) in API-handler examples.
- Replace 'Geneva cluster' with 'telemetry cluster' in a deployment
  example.
- Replace example rule id SWT-CPP-001 with ACME-CPP-001.
- Replace author: mikefan with sarif-sdk-maintainers in both skill
  frontmatters.
- Soften a reference to an unpublished companion remediation guidance
  document.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add multitool add-reporting-descriptor verb

Appends a fully-formed SARIF reportingDescriptor JSON object — supplied
via --input <path> or stdin — to the staged event log produced by
emit-init-run.

Two targets:
* Default → run.tool.driver.notifications[]. AI producers routinely emit
  notification descriptors (progress, telemetry, config errors). No id
  convention is enforced; notifications use opaque ids.
* --rules → run.tool.driver.rules[]. Gated against
  AIRuleIdConvention.IsNovel so only NOVEL- novel-finding descriptors
  are accepted. Taxonomy-mapped rule descriptors (e.g., CWE-89) come
  from the taxonomy enricher at finalize time, not from this verb.

Each descriptor id may appear at most once per event log. The verb scans
the existing event log on receipt and rejects duplicates against either
a prior add-reporting-descriptor event of the same target OR a
descriptor pre-populated on the run-header. A --force escape hatch is
acknowledged in error text but intentionally out of v1 scope.

Event-log plumbing:
* Adds SarifEventKinds.RuleDescriptor ("rule-descriptor") and
  SarifEventKinds.NotificationDescriptor ("notification-descriptor"),
  threaded through SarifEventLogReader's kind allow-list.
* SarifEventReplayer buffers descriptor events and merges them into the
  target list BEFORE RegisterDescriptorsFromResults runs. This ordering
  matters: auto-registration synthesizes minimal descriptors only for
  ruleIds that aren't already represented, so an explicit NOVEL-
  descriptor pre-empts the minimal one. Header pre-populated descriptors
  are preserved by reference; the verb's emit-time dedup blocks
  id collisions between header and events.
* New event kinds are additive within CurrentSchemaVersion = 1; older
  readers will skip unknown kinds harmlessly, matching the forward-
  compat shape used when Notification / Invocation kinds were added.
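
The ordering argument above can be illustrated with a small Python
sketch. It is a simplification under stated assumptions: descriptors
and results are plain dicts, ids are assumed already deduplicated at
emit time, and replay_rules is a hypothetical name rather than the
replayer's actual method.

```python
def replay_rules(header_rules, descriptor_events, results):
    """Merge explicit descriptors first, then auto-register missing rule ids."""
    # Header descriptors are kept as-is; event descriptors are appended.
    rules = list(header_rules) + list(descriptor_events)
    known = {rule["id"] for rule in rules}
    for result in results:
        rule_id = result.get("ruleId")
        if rule_id and rule_id not in known:
            # Only ids with no explicit descriptor get a minimal one.
            rules.append({"id": rule_id})
            known.add(rule_id)
    return rules
```

Because the explicit descriptors are merged before the result loop
runs, a rich NOVEL- descriptor is never shadowed by a synthesized
`{"id": ...}` stub.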

Tests:
* 16 [Fact] tests on AddReportingDescriptorCommand covering both happy
  paths (notifications default, --rules), id validation (missing/empty/
  non-string), the NOVEL gate (taxonomy id rejection on --rules path
  only), rich payload round-trip (messageStrings, defaultConfiguration,
  helpUri, properties — including a date-shaped property string to guard
  against Json.NET DateTime coercion), duplicate detection within and
  across targets, duplicate detection against header-pre-populated
  descriptors for both target arrays, missing-wip-file path, and two
  malformed-input cases (bad JSON, non-object root).
* 3 [Fact] tests on SarifEventReplayer covering: rule-descriptor events
  populating rules and pre-empting auto-registration, notification-
  descriptor events populating notifications, and the
  header-pre-populated + events merge semantics.

No [Theory]/[InlineData] — repeated scenarios use shared private
helpers (SeedRunHeader) per house style.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Strip editorial prefixes from AI notification taxonomy (#2934)

Notification descriptor ids now name the concern only — `DECISION`,
`RULED-OUT`, `DATA-ACCESS-DENIED`, `ALAS-SIGNAL`, `TOOL-UNAVAILABLE`,
etc. The previous `AI/EXEC/*` and `AI/CFG/*` prefixes repeated context
the surrounding SARIF already carries: the array
(`toolExecutionNotifications` vs `toolConfigurationNotifications`)
encodes the kind, and `tool.driver.name` encodes the emitter. The
same id MAY now legally appear in both arrays. Suffixing `EXEC` or
`CFG` on every id is like suffixing `Class` on every C# class — the
surrounding context already says what kind of thing it is.

Placement is selected at authoring time: `add-notification` defaults
to `toolExecutionNotifications`; `add-notification --config` (`-c`)
routes to `toolConfigurationNotifications`. The event-log kind
`SarifEventKinds.Notification` splits into `ExecutionNotification`
(`"execution-notification"`) and `ConfigurationNotification`
(`"configuration-notification"`); the replayer routes each to the
matching invocation array.
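
As a rough Python sketch of that routing: the two wire values and array
names come from the text above, while the function names and the
tuple-based event shape are assumptions made for illustration.

```python
EXECUTION = "execution-notification"
CONFIGURATION = "configuration-notification"

# Each event kind maps to exactly one invocation array.
TARGET_ARRAY = {
    EXECUTION: "toolExecutionNotifications",
    CONFIGURATION: "toolConfigurationNotifications",
}


def event_kind(config=False):
    """Pick the event kind the way add-notification / --config would."""
    return CONFIGURATION if config else EXECUTION


def route(events):
    """Replay (kind, notification) pairs into the matching arrays."""
    invocation = {name: [] for name in TARGET_ARRAY.values()}
    for kind, notification in events:
        invocation[TARGET_ARRAY[kind]].append(notification)
    return invocation
```

Since the array now carries the kind, the same descriptor id can be
routed to either array without a prefix.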

`AI1014.ExecutionNotificationPlacement` is deleted. Its sole purpose
was enforcing prefix-vs-array consistency, which is structurally
meaningless under the new convention (the array IS the kind).
`AI2018` retains its semantic; the literal id it checks changes from
`AI/EXEC/ALAS-SIGNAL` to `ALAS-SIGNAL`.

BRK by the letter of v4.6.3 (AI1014 was added there), but AI rules
adoption is low and v5.x is the right place for refinement over
back-compat.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Generalize ALAS-SIGNAL notification id to LEARNING-SIGNAL (#2935)

ALAS named a specific consumer (an internal learning system). Under
the convention shipped in #2934, notification ids name the concern,
not the consumer. LEARNING-SIGNAL describes what the signal is,
independent of who reads it.

While here, rename the AI2018 rule from ProvideExecutionSignalArtifact
to ProvideLearningSignalArtifact for consistency: the class checks
the LEARNING-SIGNAL id, the "Execution" qualifier was redundant under
the new convention (placement is encoded by the array, not the id),
and downstream learning systems aren't necessarily reading only the
execution-side array.

Affects: AI2018 rule class + file + RuleId const + 3 resource
keys/messages, the AI2018 row in docs and skills tables, and the
UNRELEASED BRK bullet for #2934 (whose own "ALAS-SIGNAL example"
becomes "LEARNING-SIGNAL", and which now documents the
class-and-id rename together).

BRK on the just-merged BRK (both still UNRELEASED) — favored over
shipping the consumer name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stamp ReleaseHistory UNRELEASED section as v5.0.0 (#2937)

build.props was already bumped to <VersionPrefix>5.0.0</VersionPrefix>
in #2924 (the SHA-1 BRK). This finishes the v5.0.0 cut by replacing
the UNRELEASED placeholder header in ReleaseHistory.md with the
canonical version banner (Sdk / Driver / Converters / Multitool /
Multitool Library nuget links), matching the v4.6.4 format.

Picked up by #2936 (the dev to main promotion PR).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trim and split the over-descriptive v5.0.0 notification-taxonomy BRK (#2938)

Per release-notes house style, bullets are one or two self-contained
sentences; PR-description prose belongs in the PR. The original bullet
was ~3x the length of its neighbors and re-litigated the motivation.

Split into two tighter bullets:

  1. Convention change + routing mechanism (id-prefix strip, new
     --config switch, event-kind split).
  2. Rule-table changes (AI1014 removal, AI2018 rename).

Drops the "prefixes were redundant because..." explanation, the wire
value parentheticals (`"execution-notification"` etc.), and the
"ALAS named a specific consumer" parenthetical. The change itself
is visible in the renames; the reader doesn't need the rationale.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix CweGenerateSample.ps1 -GHAzDO crash inside real ADO pipelines (#2940)

The mseng microsoft.sarif-sdk pipeline broke on the first build of main
after the v5.x promotion (build 31555367). Symptom:

  CweGenerateSample.ps1 (args: -Configuration Release -GHAzDO) exited with code 1.
  ADO pipeline context is partially configured. Either populate every
  required variable or clear them all.
  Problems:
    BUILD_DEFINITIONID='1234' disagrees with SYSTEM_DEFINITIONID='9978'
    (both name the same pipeline identifier and must match)

Root cause: the deterministic-fixture env override in CweGenerateSample.ps1
stamps BUILD_DEFINITIONID=1234 for byte-stable output, but does not also
override SYSTEM_DEFINITIONID. ADO agents inject both. The verb's
must-match cross-check in AdoPipelineContext.TryDetect (correctly) refuses
to proceed when the two disagree.

Fix the script (not the verb): add SYSTEM_DEFINITIONID alongside
BUILD_DEFINITIONID in the ordered hashtable, plus
SYSTEM_JOBID / SYSTEM_JOBNAME alongside SYSTEM_PHASEID / SYSTEM_PHASENAME
for symmetric hygiene (those pairs are exempt from must-match, but the
default-mode cleanup loop benefits from covering the agent's full
fallback set). The fixture SARIF
bytes do not change — the primary env vars were already set and are the
ones the verb actually reads.
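
The failure and the fix can be reproduced in miniature. The Python
sketch below models only the one must-match pair named in the
diagnostic; cross_check is a hypothetical stand-in for the check inside
AdoPipelineContext.TryDetect, not its actual code.

```python
MUST_MATCH = (("BUILD_DEFINITIONID", "SYSTEM_DEFINITIONID"),)


def cross_check(env):
    """Report primary/fallback pairs that are both set but disagree."""
    problems = []
    for primary, fallback in MUST_MATCH:
        a, b = env.get(primary), env.get(fallback)
        if a and b and a != b:
            problems.append(f"{primary}='{a}' disagrees with {fallback}='{b}'")
    return problems


# What a real agent injects (values taken from the failing build above).
ambient = {"TF_BUILD": "True", "BUILD_DEFINITIONID": "9978",
           "SYSTEM_DEFINITIONID": "9978"}
# Before the fix: only the primary is overridden for byte-stable output.
before_fix = {**ambient, "BUILD_DEFINITIONID": "1234"}
# After the fix: the fallback is overridden alongside the primary.
after_fix = {**before_fix, "SYSTEM_DEFINITIONID": "1234"}
```

Here `cross_check(before_fix)` reports the disagreement and
`cross_check(after_fix)` reports nothing, which mirrors why the script
change alone is sufficient.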

Regression gate: new
  CweGHAzDoSample_RegenerationSucceeds_WhenAmbientAdoFallbackEnvVarsConflict
[Fact] explicitly seeds SYSTEM_DEFINITIONID / SYSTEM_JOBID / SYSTEM_JOBNAME
with values that disagree with the script's deterministic primaries
before invoking the script. Without the script fix it fails the same way
the mseng build did; with the fix it passes byte-identical.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Refresh v5.0.0 release-history layout + add prefix legend (#2939)

Three changes, all in ReleaseHistory.md:

1. Add a prefix legend at the top of the file. Codifies the six prefixes
   (DEP / BRK / BUG / NEW / PRF / FUN) and the 'BRK leads each section'
   rule. Footnote notes that older sections may predate the convention.

2. Reorder the v5.0.0 section so all BRK bullets lead (BRK -> NEW -> BUG).
   Pure line shuffling; relative order preserved within each group.

3. Normalize the lone 'BUGFIX:' bullet in v4.6.4 to 'BUG:' (matches the
   legend's canonical form). The deep-history 'BUGFIX, BRK:' entry in
   the v1.x section is left alone — that's immutable shipped state.

No code or schema changes; ReleaseHistory.md only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Replace emit-init-run flags with SARIF Run JSON contract

A consumer agent reports the existing 14 typed CLI flags can't express
multiple versionControlProvenance entries (one of which carries a
properties bag documenting skills in play). Modeling every field as a
flag explodes the surface; the peer emit verbs (add-result,
add-notification, add-reporting-descriptor) already accept fully-formed
SARIF JSON via --input or stdin for exactly this reason, with a
documented rationale that applies more strongly to the run header than
to a single result.

Replace the v5.0.0 flag surface on emit-init-run with the same
input/stdin payload contract. EmitInitRunOptions shrinks to three
properties (OutputFilePath, InputFilePath, ForceOverwrite). The
SarifEventReplayer's documented partial-Run shape (tool, language,
columnKind, defaultEncoding, defaultSourceLanguage, originalUriBaseIds,
versionControlProvenance, automationDetails, baselineGuid,
redactionTokens, ...) is now reachable end-to-end through the verb.

Receipt-time validators (no filesystem side-effects on rejection):
required non-empty-string tool.driver.name; https-only
tool.driver.informationUri and versionControlProvenance[].repositoryUri;
https-or-file originalUriBaseIds["SRCROOT"].uri; canonical 8-4-4-4-12
automationDetails.guid/correlationGuid; exact-match ai/origin in
{generated, annotated, synthesized}; SARIF-log-document rejection;
parent-shape JSON-object enforcement at every nested accessor so a
JValue indexer never throws into the broad catch. ADO stamping is now
JToken-direct so producer-supplied SARIF fields outside the SDK typed
Run model survive the wip-line append; the existing typed-Run
materialization at emit-finalize is the documented boundary at which
non-typed fields are dropped, consistent with every other SDK
round-trip.
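
A few of these receipt-time checks can be sketched in Python as below.
This is an approximation, not the verb's code: the helper names are
hypothetical, treating a top-level `runs` key as the marker of a full
SARIF log is an assumption, and so is reading `ai/origin` from the
run's properties bag.

```python
import re
from urllib.parse import urlparse

GUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
AI_ORIGINS = {"generated", "annotated", "synthesized"}


def _obj(parent, key):
    """Return parent[key] only when both are JSON objects, else {}."""
    value = parent.get(key) if isinstance(parent, dict) else None
    return value if isinstance(value, dict) else {}


def _is_https(uri):
    return isinstance(uri, str) and urlparse(uri).scheme == "https"


def validate_run_header(run):
    """Return a list of problems; an empty list means acceptance."""
    if not isinstance(run, dict):
        return ["payload must be a JSON object"]
    if "runs" in run:  # assumption: marks a full SARIF log document
        return ["payload is a SARIF log document, not a single run"]
    problems = []
    driver = _obj(_obj(run, "tool"), "driver")
    name = driver.get("name")
    if not isinstance(name, str) or not name:
        problems.append("tool.driver.name must be a non-empty string")
    if "informationUri" in driver and not _is_https(driver["informationUri"]):
        problems.append("tool.driver.informationUri must be https")
    vcps = run.get("versionControlProvenance")
    for i, vcp in enumerate(vcps if isinstance(vcps, list) else []):
        if not isinstance(vcp, dict) or not _is_https(vcp.get("repositoryUri")):
            problems.append(f"versionControlProvenance[{i}].repositoryUri must be https")
    details = _obj(run, "automationDetails")
    for field in ("guid", "correlationGuid"):
        value = details.get(field)
        if value is not None and not (isinstance(value, str) and GUID_RE.fullmatch(value)):
            problems.append(f"automationDetails.{field} must be a canonical GUID")
    origin = _obj(run, "properties").get("ai/origin")  # assumed location
    if origin is not None and not (isinstance(origin, str) and origin in AI_ORIGINS):
        problems.append("ai/origin must be generated, annotated, or synthesized")
    return problems
```

The `_obj` helper reflects the parent-shape point above: every nested
lookup tolerates a non-object parent instead of raising.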

AdoPipelineContext.ApplyTo(Run) becomes
bool TryApplyTo(Run, out string error). It stamps automationDetails.id
and the four azuredevops/pipeline/build/* properties only when absent
and fails-with-diagnostic on per-field conflict. The previous
unconditional-overwrite contract was inert in v5.0.0 (the flag surface
couldn't supply those fields) but became a footgun once JSON input
could.
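
The stamp-when-absent, fail-on-conflict contract can be sketched in
Python as follows. The try_apply name, the dict-based model, and the
property key used in the usage note are hypothetical; only the contract
itself comes from the description above.

```python
def try_apply(details, stamped_id, stamped_properties):
    """Stamp pipeline values into automationDetails; return (ok, error).

    Nothing is mutated unless every field is either absent or already
    equal to the pipeline-derived value.
    """
    properties = details.get("properties") or {}
    candidates = [("id", details.get("id"), stamped_id)]
    candidates += [
        (key, properties.get(key), value)
        for key, value in stamped_properties.items()
    ]
    for key, existing, value in candidates:
        if existing is not None and existing != value:
            return False, f"{key}: '{existing}' conflicts with pipeline value '{value}'"
    # No conflicts: fill in only what the producer left absent.
    if details.get("id") is None:
        details["id"] = stamped_id
    merged = dict(stamped_properties)
    merged.update(properties)
    details["properties"] = merged
    return True, None
```

For example, calling it on an empty dict with a hypothetical key such
as `azuredevops/pipeline/build/example` stamps both the id and the
property, while a producer-supplied id that differs from the pipeline
one returns an error and leaves the dict untouched.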

CweGenerateSample.ps1 rewrites its emit-init-run call to construct a
PowerShell hashtable -> ConvertTo-Json -Depth 32 -Compress -> stdin
pipe. Both CweSample.sarif and CweGHAzDoSample.sarif regenerate
byte-identically (verified by CweGeneratedSampleTests, which gates the
fixtures' SHA-256 hashes).
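
For readers not working in PowerShell, the same construct-then-pipe
shape looks roughly like this in Python. The tool name, the second
repository URI, and the properties bag are hypothetical placeholders;
the sketch only builds the compact JSON text that would be written to
the verb's stdin.

```python
import json

# A partial SARIF Run object with two versionControlProvenance entries,
# the case the typed flags could not express.
run_header = {
    "tool": {"driver": {"name": "ExampleScanner"}},  # hypothetical name
    "versionControlProvenance": [
        {"repositoryUri": "https://github.com/microsoft/sarif-sdk"},
        {
            "repositoryUri": "https://github.com/example/skills",  # hypothetical
            "properties": {"skills": ["emit-sarif-findings"]},  # hypothetical bag
        },
    ],
}


def to_stdin_payload(run):
    """Serialize the run header as single-line JSON without padding."""
    return json.dumps(run, separators=(",", ":"))


payload = to_stdin_payload(run_header)
```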

skills/emit-sarif-findings/SKILL.md Step 1 is rewritten to show the
JSON construction; the inputs table picks up the multi-VCP and
properties-bag annotations; the package constraint bumps to
Sarif.Multitool >= 5.1.0. docs/multitool-usage.md's flag example is
replaced with the stdin form.

ReleaseHistory.md gets a new v5.1.0 UNRELEASED section with three
bullets: BRK on the flag-surface removal, BRK on
AdoPipelineContext.ApplyTo, NEW on the JSON-payload contract.

Verification:
- dotnet build src/Sarif.Sdk.sln: 0 warnings, 0 errors.
- Test.UnitTests.Sarif.Multitool.Library: 217 passed, 1 skipped.
- Test.UnitTests.Sarif: 896 passed, 3 skipped.
- Test.UnitTests.Sarif.Driver: 140 passed, 1 skipped.
- CweGeneratedSampleTests (3): pass; both fixtures byte-identical.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Stamp ReleaseHistory v5.0.1 section with nuget links

src/build.props was bumped to <VersionPrefix>5.0.1</VersionPrefix>
in be6fb70 (the emit-init-run JSON-contract change). This finishes
the v5.0.1 cut by replacing the UNRELEASED placeholder header in
ReleaseHistory.md with the canonical version banner (Sdk / Driver /
Converters / Multitool / Multitool Library nuget links), matching the
v5.0.0 format. Folded into #2942 so main is shippable the moment the
promotion lands.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Trim v5.0.1 release-notes bullets to neighbor density

The three bullets were 4-6 sentences each, embedding validator catalogs
and finalize-time round-trip prose that belong in the PR description,
not in ReleaseHistory.md. Repo style explicitly calibrates against the
neighbors and asks for trim/split when a bullet exceeds ~3x — the BRK
and NEW bullets here now sit at roughly the same density as the v5.0.0
rename bullets above them.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add multitool add-invocation verb

Mirrors add-result / add-notification / add-reporting-descriptor: takes a
fully-formed SARIF Invocation JSON object via --input <path> or stdin and
appends it to the staged event log as a SarifEventKinds.Invocation event.

SarifEventReplayer strips run.invocations[] carried on the run header, so
this verb is the only path producers have to populate the array. The verb
imposes no schema beyond requiring a JSON object (SARIF makes every field on
Invocation optional); full-log shape validation lives in emit-finalize --validate.

AddInvocationOptions / AddInvocationCommand follow the established pattern.
Program.cs registers and dispatches the new verb. SKILL.md, docs/multitool-usage.md,
and ReleaseHistory.md updated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Drop unused System.Text using in EmitInitRunCommandTests

CI's BuildAndTest.ps1 invokes dotnet build with --no-incremental and
/p:EnforceCodeStyleInBuild=true, which surfaces IDE0005 (unused using)
as an error. Local Debug + default incremental builds skipped the check
and let the unused System.Text directive ride into be6fb70.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>