diff --git a/.copilot-tracking/changes/2026-03-12/sarif-github-code-scanning-changes.md b/.copilot-tracking/changes/2026-03-12/sarif-github-code-scanning-changes.md new file mode 100644 index 0000000..d64b753 --- /dev/null +++ b/.copilot-tracking/changes/2026-03-12/sarif-github-code-scanning-changes.md @@ -0,0 +1,57 @@ + +# Release Changes: Improve SARIF Output for GitHub Code Scanning + +**Related Plan**: sarif-github-code-scanning-plan.instructions.md +**Implementation Date**: 2026-03-12 + +## Summary + +Enrich SARIF output so GitHub Code Scanning displays complete inline rule help with WCAG guidance, correct IBM help URLs, properly categorized severity/precision metadata, and enriched result messages for every accessibility alert. + +## Changes + +### Added + +* `src/lib/scanner/result-normalizer.ts` — Added `extractIbmHelpUrl()` helper function that parses IBM Equal Access archive URLs, strips `#` fragments, and falls back to `/archives/latest/` pattern when help field is missing or not a URL +* `src/lib/report/sarif-generator.ts` — Added `buildHelpMarkdown()` function generating rich Markdown rule help with title, description, impact, principle, engine, WCAG tags, and learn-more links +* `src/lib/report/sarif-generator.ts` — Added `buildHelpText()` function generating plain-text rule help as GitHub fallback +* `src/lib/report/sarif-generator.ts` — Added `mapEngineToPrecision()` mapping axe-core → very-high, ibm-equal-access → high, default → medium +* `src/lib/report/sarif-generator.ts` — Added `mapImpactToSeverity()` mapping critical/serious → error, moderate → warning, minor → recommendation + +### Modified + +* `src/lib/scanner/result-normalizer.ts` — Changed `help` field mapping in `normalizeIbmResults()` from `r.help ?? 
r.message` to `r.message` (IBM `r.help` contains a URL, not text) +* `src/lib/scanner/result-normalizer.ts` — Changed `helpUrl` mapping from broken `/rules/tools/help/` pattern to `extractIbmHelpUrl(r.help, r.ruleId)` using working archive URLs +* `src/lib/report/sarif-generator.ts` — Expanded `SarifRule` interface with `fullDescription`, `help` (text + markdown), `defaultConfiguration`, enriched `properties` (precision, problem.severity) +* `src/lib/report/sarif-generator.ts` — Expanded `SarifRun` interface with `informationUri`, `semanticVersion` on tool.driver and optional `automationDetails` +* `src/lib/report/sarif-generator.ts` — Updated `buildRun()` rule construction to populate all new fields; `shortDescription` changed from `violation.description` to `violation.help` +* `src/lib/report/sarif-generator.ts` — Enriched `SarifResult.message.text` with description, help, scanned URL, selector, element count, and optional failureSummary +* `src/lib/report/sarif-generator.ts` — Added `automationDetails.id` to `buildRun()` return block +* `src/lib/scanner/__tests__/result-normalizer.test.ts` — Updated 2 existing IBM tests, added 3 new tests for archive URL extraction, fallback, and help text separation (48 total) +* `src/lib/report/__tests__/sarif-generator.test.ts` — Updated 2 existing tests, added 11 new tests for enriched fields, tool metadata, failureSummary, site SARIF, and IBM markdown links (24 total) + +### Removed + +* None + +## Additional or Deviating Changes + +* DD-01: `shortDescription.text` changed from `violation.description` to `violation.help` per plan design decision — the concise one-liner is more appropriate for GitHub's brief label display +* DD-02: `failureSummary` included in enriched `message.text` per DR-06 remediation — appended conditionally when present on the node +* Phase 2 subagent pre-added `informationUri` and `semanticVersion` to the return block; Phase 3 only needed to add `automationDetails` + +## Release Summary + +Total files 
affected: 4 (2 production, 2 test) + +**Created:** None +**Modified:** +* `src/lib/scanner/result-normalizer.ts` — IBM URL fix and `extractIbmHelpUrl()` helper +* `src/lib/report/sarif-generator.ts` — Full SARIF enrichment (interfaces, helpers, rule/result/metadata construction) +* `src/lib/scanner/__tests__/result-normalizer.test.ts` — 48 tests (3 new, 2 updated) +* `src/lib/report/__tests__/sarif-generator.test.ts` — 24 tests (11 new, 2 updated) +**Removed:** None + +**Dependencies:** No new dependencies added +**Infrastructure:** No infrastructure changes +**Deployment notes:** SARIF output format enriched — GitHub Code Scanning will display inline rule help and enriched metadata on next SARIF upload diff --git a/.copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md b/.copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md new file mode 100644 index 0000000..40b7f60 --- /dev/null +++ b/.copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md @@ -0,0 +1,518 @@ + +# Implementation Details: Improve SARIF Output for GitHub Code Scanning + +## Context Reference + +Sources: `.copilot-tracking/research/2026-03-12/sarif-github-code-scanning-research.md`, `src/lib/report/sarif-generator.ts`, `src/lib/scanner/result-normalizer.ts`, `src/lib/types/scan.ts` + +## Implementation Phase 1: Fix IBM Equal Access URL and Help Text + + + +### Step 1.1: Add `extractIbmHelpUrl()` helper function to `result-normalizer.ts` + +Add a helper function above `normalizeIbmResults()` that extracts the base URL (before `#` fragment) from the raw IBM `help` field. The raw IBM `help` field contains a full archive URL like `https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#...` with an encoded JSON fragment that must be stripped. 
+ +```typescript +function extractIbmHelpUrl(rawHelp: string | undefined, ruleId: string): string { + if (rawHelp) { + try { + const url = new URL(rawHelp); + return `${url.origin}${url.pathname}`; + } catch { + // not a URL, fall through + } + } + return `https://able.ibm.com/rules/archives/latest/doc/en-US/${ruleId}.html`; +} +``` + +Files: +* `src/lib/scanner/result-normalizer.ts` — Add function before `normalizeIbmResults()` (around line 55) + +Discrepancy references: +* Addresses research Discovery 3 Bug A (wrong URL pattern) + +Success criteria: +* Function parses archive URLs correctly and strips fragments +* Function falls back to `/archives/latest/` URL when `rawHelp` is undefined or not a URL +* No regression in existing normalizer behavior + +Context references: +* Research document (Lines 190-210) — IBM helpUrl fix code example +* `src/lib/scanner/result-normalizer.ts` (Lines 60-86) — Current IBM normalizer code + +Dependencies: +* None — standalone helper function + +### Step 1.2: Fix IBM `help` and `helpUrl` field mapping in `normalizeIbmResults()` + +Change two lines in the `.map()` callback of `normalizeIbmResults()`: +- Line 74: Change `help: r.help ?? r.message` to `help: r.message` — since `r.help` is a URL, not text, use `r.message` as the human-readable help text. +- Line 75: Change `helpUrl: \`https://able.ibm.com/rules/tools/help/${r.ruleId}\`` to `helpUrl: extractIbmHelpUrl(r.help, r.ruleId)` — use the working archive URL. 
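
The effect of the two changed lines can be sketched in isolation. This is a minimal illustration, assuming a hypothetical `IbmRawResult` shape and a hypothetical `mapIbmHelpFields()` wrapper that stands in for the `.map()` callback; the sample message text and the fragment value are made up. Only `extractIbmHelpUrl()` is taken from Step 1.1.

```typescript
// Hypothetical raw IBM result shape; only the fields used here are modelled.
interface IbmRawResult {
  ruleId: string;
  message: string;
  help?: string;
}

// Same helper as in Step 1.1.
function extractIbmHelpUrl(rawHelp: string | undefined, ruleId: string): string {
  if (rawHelp) {
    try {
      const url = new URL(rawHelp);
      // origin + pathname drops the "#" fragment (and any query string)
      return `${url.origin}${url.pathname}`;
    } catch {
      // not a URL, fall through
    }
  }
  return `https://able.ibm.com/rules/archives/latest/doc/en-US/${ruleId}.html`;
}

// Hypothetical stand-in for the two changed lines in the .map() callback.
function mapIbmHelpFields(r: IbmRawResult): { help: string; helpUrl: string } {
  return {
    help: r.message, // human-readable text, never the URL
    helpUrl: extractIbmHelpUrl(r.help, r.ruleId),
  };
}

// Case 1: raw help is an archive URL carrying an encoded JSON fragment.
const withArchiveUrl = mapIbmHelpFields({
  ruleId: 'style_color_misuse',
  message: 'Sample message text for illustration',
  help: 'https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#%7B%22value%22%3A%22x%22%7D',
});

// Case 2: raw help is missing, so the "latest" archive fallback is used.
const withoutHelp = mapIbmHelpFields({
  ruleId: 'img_alt_valid',
  message: 'Another sample message',
});

console.log(withArchiveUrl.helpUrl);
console.log(withoutHelp.helpUrl);
```

Because the helper rebuilds the URL from `origin` and `pathname`, a query string would be dropped along with the fragment. That appears acceptable for the archive pages described here, but it is worth keeping in mind if IBM ever adds meaningful query parameters.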
+ +Files: +* `src/lib/scanner/result-normalizer.ts` — Modify lines 74-75 + +Discrepancy references: +* Addresses research Discovery 3 (both Bug A and Bug B) +* Addresses research Discovery 4 (IBM `help` field is URL, not text) + +Success criteria: +* IBM normalized violations have `help` set to the human-readable message text +* IBM normalized violations have `helpUrl` set to the working archive URL +* URLs do not contain `#` fragments with encoded JSON + +Context references: +* Research document (Lines 190-215) — Fix code example +* `src/lib/scanner/result-normalizer.ts` (Lines 72-76) — Current mapping + +Dependencies: +* Step 1.1 (extractIbmHelpUrl function) + +### Step 1.3: Update IBM normalizer tests for new helpUrl and help text behavior + +Update existing tests and add new ones in `result-normalizer.test.ts`: + +1. **Update existing test** "constructs IBM helpUrl from ruleId" — change expected URL from `https://able.ibm.com/rules/tools/help/img_alt_valid` to the archive URL pattern. +2. **Update existing test** "maps message to description and help" — when `help` is a URL, the normalized `help` should be `r.message`, not the URL. +3. **Add new test** "extracts base URL from IBM help field" — provide a raw IBM `help` URL with fragment, verify `helpUrl` strips the fragment. +4. **Add new test** "falls back to archive URL when help is not a URL" — provide non-URL help, verify fallback. +5. **Update existing test** "falls back to message when help is not provided" — verify `help` is `r.message`. 
+ +Files: +* `src/lib/scanner/__tests__/result-normalizer.test.ts` — Update and add tests in the `normalizeIbmResults` describe block + +Discrepancy references: +* Validates fixes for DR items related to IBM URLs + +Success criteria: +* All IBM normalizer tests pass +* New tests cover archive URL extraction, fragment stripping, and fallback behavior +* Test for `help` field confirms it contains human-readable text, not a URL + +Context references: +* `src/lib/scanner/__tests__/result-normalizer.test.ts` (Lines 68-180) — Existing IBM tests +* Research document (Lines 220-245) — IBM help field analysis + +Dependencies: +* Steps 1.1 and 1.2 completion + +## Implementation Phase 2: Enrich SARIF Rule Descriptors + + + +### Step 2.1: Update `SarifRule` interface with all GitHub-supported fields + +Expand the `SarifRule` interface in `sarif-generator.ts` to include all fields that GitHub Code Scanning supports and renders: + +```typescript +interface SarifRule { + id: string; + name: string; + shortDescription: { text: string }; + fullDescription: { text: string }; + helpUri: string; + help: { + text: string; + markdown: string; + }; + defaultConfiguration: { + level: 'error' | 'warning' | 'note'; + }; + properties: { + tags: string[]; + precision: 'very-high' | 'high' | 'medium' | 'low'; + 'problem.severity': 'error' | 'warning' | 'recommendation'; + }; +} +``` + +Also update `SarifRun` to add `informationUri` and `semanticVersion` to `tool.driver`, and add `automationDetails` to the run: + +```typescript +interface SarifRun { + tool: { + driver: { + name: string; + version: string; + informationUri: string; + semanticVersion: string; + rules: SarifRule[]; + }; + }; + automationDetails?: { id: string }; + results: SarifResult[]; +} +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Modify `SarifRule` interface (lines 20-26), `SarifRun` interface (lines 9-18) + +Discrepancy references: +* Addresses research Discovery 1 (missing `help.text`/`help.markdown`) +* 
Addresses research Discovery 6 (missing `fullDescription.text`) + +Success criteria: +* Interface includes all GitHub-required and recommended properties +* TypeScript compilation succeeds with the updated interface + +Context references: +* Research document (Lines 105-130) — Target SarifRule interface +* `src/lib/report/sarif-generator.ts` (Lines 9-26) — Current interfaces + +Dependencies: +* None — interface change only + +### Step 2.2: Add `buildHelpMarkdown()` function + +Create a function that generates rich Markdown help content from an `AxeViolation`. This content is displayed in GitHub's "Rule help" panel when a developer clicks an alert. + +```typescript +function buildHelpMarkdown(violation: AxeViolation): string { + const lines: string[] = [ + `# ${violation.help}`, + '', + violation.description, + '', + `**Impact:** ${violation.impact}`, + ]; + + if (violation.principle) { + lines.push(`**Principle:** ${violation.principle}`); + } + + if (violation.engine) { + lines.push(`**Engine:** ${violation.engine}`); + } + + const wcagTags = violation.tags.filter(t => /^wcag\d/.test(t)); + if (wcagTags.length > 0) { + lines.push('', '## WCAG Criteria', ''); + for (const tag of wcagTags) { + lines.push(`- \`${tag}\``); + } + } + + lines.push('', '## Learn More', ''); + if (violation.helpUrl) { + lines.push(`- [Rule documentation](${violation.helpUrl})`); + } + + return lines.join('\n'); +} +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Add function after `urlToArtifactPath()` (around line 78) + +Discrepancy references: +* Addresses research Discovery 1 — core fix for "no rule help available" +* Addresses research Discovery 2 — embeds helpUrl as markdown link instead of relying on `helpUri` + +Success criteria: +* Generated markdown includes title, description, impact, WCAG tags, and learn more link +* Markdown renders correctly in GitHub's rule help panel +* Handles violations with no `principle`, no `engine`, and no WCAG tags gracefully + +Context 
references: +* Research document (Lines 225-255) — `buildHelpMarkdown` code example +* Research document (Lines 135-175) — Ideal SARIF rule JSON example + +Dependencies: +* `AxeViolation` type import (already present) + +### Step 2.3: Add `buildHelpText()` function + +Create a plain-text version of the rule help content. GitHub requires `help.text` and falls back to it when `help.markdown` is not supported. + +```typescript +function buildHelpText(violation: AxeViolation): string { + const parts: string[] = [ + violation.help, + '', + violation.description, + '', + `Impact: ${violation.impact}`, + ]; + + if (violation.principle) { + parts.push(`Principle: ${violation.principle}`); + } + + const wcagTags = violation.tags.filter(t => /^wcag\d/.test(t)); + if (wcagTags.length > 0) { + parts.push('', 'WCAG Criteria:'); + for (const tag of wcagTags) { + parts.push(` - ${tag}`); + } + } + + if (violation.helpUrl) { + parts.push('', `Learn more: ${violation.helpUrl}`); + } + + return parts.join('\n'); +} +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Add function after `buildHelpMarkdown()` (around line 110) + +Discrepancy references: +* Addresses research Discovery 1 — `help.text` is Required by GitHub + +Success criteria: +* Plain text includes the same information as markdown without formatting +* No markdown syntax in the output + +Context references: +* Research document (Lines 105-130) — Target interface shows `help.text` requirement + +Dependencies: +* `AxeViolation` type import (already present) + +### Step 2.4: Add mapping functions for `defaultConfiguration.level`, `precision`, and `problem.severity` + +Add two small mapping functions: + +```typescript +function mapEngineToPrecision(engine?: string): 'very-high' | 'high' | 'medium' | 'low' { + switch (engine) { + case 'axe-core': + return 'very-high'; + case 'ibm-equal-access': + return 'high'; + default: + return 'medium'; + } +} + +function mapImpactToSeverity(impact: string): 'error' | 'warning' | 
'recommendation' { + switch (impact) { + case 'critical': + case 'serious': + return 'error'; + case 'moderate': + return 'warning'; + case 'minor': + default: + return 'recommendation'; + } +} +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Add functions after `buildHelpText()` (around line 135) + +Discrepancy references: +* None — directly implements user requirement for precision/severity properties + +Success criteria: +* `mapEngineToPrecision` returns `very-high` for axe, `high` for IBM, `medium` for others +* `mapImpactToSeverity` returns `error`/`warning`/`recommendation` matching `mapImpactToLevel` pattern +* Existing `mapImpactToLevel` function is reused for `defaultConfiguration.level` + +Context references: +* Research document (Lines 105-130) — Target properties +* `src/lib/report/sarif-generator.ts` (Lines 49-59) — Existing `mapImpactToLevel` + +Dependencies: +* None — standalone mapping functions + +### Step 2.5: Update `buildRun()` to populate all new fields on each rule + +Modify the `buildRun()` function to populate the expanded `SarifRule` fields in the rule construction block: + +```typescript +const rule: SarifRule = { + id: violation.id, + name: violation.id, + shortDescription: { text: violation.help }, + fullDescription: { text: violation.description }, + helpUri: violation.helpUrl, + help: { + text: buildHelpText(violation), + markdown: buildHelpMarkdown(violation), + }, + defaultConfiguration: { + level: mapImpactToLevel(violation.impact), + }, + properties: { + tags: violation.tags, + precision: mapEngineToPrecision(violation.engine), + 'problem.severity': mapImpactToSeverity(violation.impact), + }, +}; +``` + +Note: `shortDescription.text` changes from `violation.description` to `violation.help` — the `.help` field is the short one-liner (e.g., "Ensure contrast ratio is sufficient") while `.description` is the longer explanation. The current code uses `description` for short and omits the full description entirely. 
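
To make the short/full description swap and the two parallel impact mappings concrete, here is a trimmed-down sketch. The `SampleViolation` type and the `describeRule()` helper are hypothetical, and the body of `mapImpactToLevel()` is an assumption inferred from its return type, since the existing function is referenced but not shown in this plan. `mapImpactToSeverity()` is copied from Step 2.4.

```typescript
// Hypothetical, reduced violation shape for illustration only.
interface SampleViolation {
  id: string;
  help: string;
  description: string;
  impact: string;
}

// ASSUMPTION: mirrors what the existing mapImpactToLevel() is described as doing.
function mapImpactToLevel(impact: string): 'error' | 'warning' | 'note' {
  switch (impact) {
    case 'critical':
    case 'serious':
      return 'error';
    case 'moderate':
      return 'warning';
    default:
      return 'note';
  }
}

// Copied from Step 2.4.
function mapImpactToSeverity(impact: string): 'error' | 'warning' | 'recommendation' {
  switch (impact) {
    case 'critical':
    case 'serious':
      return 'error';
    case 'moderate':
      return 'warning';
    case 'minor':
    default:
      return 'recommendation';
  }
}

// Hypothetical helper showing only the fields discussed in the note above.
function describeRule(v: SampleViolation) {
  return {
    id: v.id,
    shortDescription: { text: v.help }, // concise one-liner
    fullDescription: { text: v.description }, // longer explanation
    defaultConfiguration: { level: mapImpactToLevel(v.impact) },
    properties: { 'problem.severity': mapImpactToSeverity(v.impact) },
  };
}

const sampleRule = describeRule({
  id: 'color-contrast',
  help: 'Ensure contrast ratio is sufficient',
  description: 'Sample longer explanation of the contrast requirement',
  impact: 'minor',
});

console.log(JSON.stringify(sampleRule, null, 2));
```

Under these assumptions the only place the two mappings differ is the lowest tier: a `minor` impact yields `note` for `defaultConfiguration.level` but `recommendation` for `problem.severity`.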
+ +Files: +* `src/lib/report/sarif-generator.ts` — Modify the rule construction in `buildRun()` (lines 91-97) + +Discrepancy references: +* Implements research Discovery 1, 2, 5, 6 together in the rule builder + +Success criteria: +* Every rule in the SARIF output has `fullDescription.text`, `help.text`, `help.markdown`, `defaultConfiguration.level`, `properties.precision`, and `properties.problem.severity` +* `shortDescription` uses `violation.help` (concise) and `fullDescription` uses `violation.description` (detailed) +* TypeScript compiles without errors + +Context references: +* `src/lib/report/sarif-generator.ts` (Lines 86-97) — Current rule construction +* Research document (Lines 135-175) — Ideal SARIF rule JSON + +Dependencies: +* Steps 2.1-2.4 (interface, helpers, mappers) + +## Implementation Phase 3: Enrich SARIF Results and Tool Metadata + + + +### Step 3.1: Enrich `SarifResult.message.text` with description, URL, selector, and element count + +Update the result message construction in `buildRun()` to provide a more information-dense first sentence. GitHub shows the first line of `message.text` as the alert summary in the list view. + +Current (line 105): +```typescript +message: { text: `${violation.help} (${url} — ${target})` }, +``` + +Target: +```typescript +message: { + text: `${violation.description}: ${violation.help}. Scanned URL: ${url} — Selector: ${target} — ${violation.nodes.length} element(s) affected${node.failureSummary ? 
` — ${node.failureSummary}` : ''}`, +}, +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Modify message construction in `buildRun()` (around line 105) + +Discrepancy references: +* Addresses research Discovery 5 — rich data available but lost + +Success criteria: +* Result message includes violation description, help text, scanned URL, CSS selector, element count, and `failureSummary` when available +* First sentence of message is descriptive enough for the alert list view + +Context references: +* Research document (Lines 197-220) — Ideal SARIF result example +* `src/lib/report/sarif-generator.ts` (Lines 99-116) — Current result construction + +Dependencies: +* Phase 2 completion (interface changes) + +### Step 3.2: Add `tool.driver.informationUri`, `tool.driver.semanticVersion`, and `automationDetails.id` + +Update the return object of `buildRun()` to include additional tool metadata: + +```typescript +return { + tool: { + driver: { + name: 'accessibility-scanner', + version: toolVersion, + informationUri: 'https://github.com/devopsabcs-engineering/accessibility-scan-demo-app', + semanticVersion: toolVersion, + rules, + }, + }, + automationDetails: { + id: `accessibility-scan/${url}`, + }, + results, +}; +``` + +Files: +* `src/lib/report/sarif-generator.ts` — Modify the return block of `buildRun()` (lines 118-128) + +Discrepancy references: +* Directly addresses user requirement for tool identification + +Success criteria: +* `tool.driver.informationUri` points to the GitHub repository +* `tool.driver.semanticVersion` matches the version string +* `automationDetails.id` uniquely identifies the scan run + +Context references: +* Research document (Lines 105-130) — Target SarifRun interface +* `src/lib/report/sarif-generator.ts` (Lines 118-128) — Current return block + +Dependencies: +* Step 2.1 (SarifRun interface update) + +### Step 3.3: Update SARIF generator tests for enriched messages and tool metadata + +Add and update tests in 
`sarif-generator.test.ts`: + +1. **Update** "includes tool driver information" — assert `informationUri` and `semanticVersion` are present. +2. **Add** "includes automationDetails with scan URL" — verify `automationDetails.id` contains the scanned URL. +3. **Add** "rule includes fullDescription" — verify `fullDescription.text` is set from `violation.description`. +4. **Add** "rule includes help.text and help.markdown" — verify both are present and non-empty. +5. **Add** "rule help.markdown contains WCAG tags" — provide a violation with WCAG tags, verify they appear in markdown. +6. **Add** "rule includes defaultConfiguration.level" — verify level maps from impact. +7. **Add** "rule properties include precision and problem.severity" — verify both are set correctly. +8. **Update** "includes scanned URL in result message" — update expected message format to match enriched format. +9. **Add** "result message includes element count" — verify the element count appears in the message. +10. **Add** "rule help.markdown contains learn more link" — verify the helpUrl is embedded as a markdown link. +11. **Add** "result message includes failureSummary when present" — provide a violation with `failureSummary` and verify it appears in the message. +12. **Add** "generateSiteSarif includes enriched rule fields" — verify that `generateSiteSarif` output contains `fullDescription` and `help` on rules. +13. **Add** "IBM rule IDs with underscores produce valid markdown links" — verify that a violation with id `label_name_visible` and a helpUrl containing underscores produces an unescaped markdown link in `help.markdown`. 
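
Tests 8, 9, and 11 in the list above all exercise the enriched result message from Step 3.1. The sketch below shows what they might check, using plain throw-based checks rather than Vitest so that it runs on its own. The `buildMessageText()` helper, the `SampleViolation` and `SampleNode` types, and the assumption that `target` is the node's selector array joined with spaces are all hypothetical; the real code builds the string inline inside `buildRun()`.

```typescript
// Hypothetical reduced shapes; the real AxeViolation/AxeNode types have more fields.
interface SampleNode {
  target: string[];
  failureSummary?: string;
}

interface SampleViolation {
  description: string;
  help: string;
  nodes: SampleNode[];
}

// Em-dash separator used by the Step 3.1 message template.
const SEP = ' \u2014 ';

// Hypothetical helper wrapping the Step 3.1 template literal.
function buildMessageText(violation: SampleViolation, node: SampleNode, url: string): string {
  const target = node.target.join(' '); // ASSUMPTION: how `target` is derived
  const summary = node.failureSummary ? `${SEP}${node.failureSummary}` : '';
  return (
    `${violation.description}: ${violation.help}. Scanned URL: ${url}` +
    `${SEP}Selector: ${target}` +
    `${SEP}${violation.nodes.length} element(s) affected${summary}`
  );
}

const violation: SampleViolation = {
  description: 'Sample description of the rule',
  help: 'Sample one-line help',
  nodes: [
    { target: ['img.logo'], failureSummary: 'Sample failure summary' },
    { target: ['#hero', 'img'] }, // no failureSummary on this node
  ],
};

const scannedUrl = 'https://example.com/page';
const first = buildMessageText(violation, violation.nodes[0], scannedUrl);
const second = buildMessageText(violation, violation.nodes[1], scannedUrl);

// Throw-based checks corresponding to tests 8, 9, and 11.
function check(condition: boolean, label: string): void {
  if (!condition) throw new Error(`check failed: ${label}`);
}

check(first.includes(`Scanned URL: ${scannedUrl}`), 'scanned URL present');
check(first.includes('2 element(s) affected'), 'element count present');
check(first.endsWith('Sample failure summary'), 'failureSummary appended');
check(second.endsWith('2 element(s) affected'), 'no trailing summary when absent');

console.log(first);
console.log(second);
```

Note that the element count comes from `violation.nodes.length`, so every result for the same violation reports the same count regardless of which node it describes.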
+ +Files: +* `src/lib/report/__tests__/sarif-generator.test.ts` — Add and update tests + +Discrepancy references: +* Validates all enrichment changes from Phases 2 and 3 + +Success criteria: +* All new and updated tests pass +* Tests cover all GitHub-required fields (`fullDescription`, `help.text`, `help.markdown`) +* Tests cover all GitHub-recommended fields (`precision`, `problem.severity`, `defaultConfiguration`) +* Tests verify tool metadata (`informationUri`, `semanticVersion`, `automationDetails`) + +Context references: +* `src/lib/report/__tests__/sarif-generator.test.ts` (Lines 1-110) — Existing test suite +* Research document (Lines 135-220) — Expected SARIF output examples + +Dependencies: +* Steps 3.1 and 3.2 completion + +## Implementation Phase 4: Validation + + + +### Step 4.1: Run full project validation + +Execute all validation commands for the project: +* `npm run lint` — ESLint across the project +* `npm run build` — Next.js production build to catch type errors +* `npm run test` — Full Vitest test suite + +### Step 4.2: Fix minor validation issues + +Iterate on lint errors, build warnings, and test failures. Apply fixes directly when corrections are straightforward and isolated. + +### Step 4.3: Report blocking issues + +When validation failures require changes beyond minor fixes: +* Document the issues and affected files. +* Provide the user with next steps. +* Recommend additional research and planning rather than inline fixes. +* Avoid large-scale refactoring within this phase. 
+ +## Dependencies + +* Node.js and npm (project build and test toolchain) +* Vitest (test runner) +* ESLint (linter) +* Next.js (build toolchain) + +## Success Criteria + +* GitHub Code Scanning displays inline rule help for every accessibility alert +* IBM rule URLs resolve to working archive pages +* All existing and new tests pass +* Lint and build produce no errors diff --git a/.copilot-tracking/plans/2026-03-12/sarif-github-code-scanning-plan.instructions.md b/.copilot-tracking/plans/2026-03-12/sarif-github-code-scanning-plan.instructions.md new file mode 100644 index 0000000..4280b2f --- /dev/null +++ b/.copilot-tracking/plans/2026-03-12/sarif-github-code-scanning-plan.instructions.md @@ -0,0 +1,129 @@ +--- +applyTo: '.copilot-tracking/changes/2026-03-12/sarif-github-code-scanning-changes.md' +--- + +# Implementation Plan: Improve SARIF Output for GitHub Code Scanning + +## Overview + +Enrich the SARIF output produced by the accessibility scanner so that GitHub Code Scanning displays complete, inline rule help with WCAG guidance, correct IBM help URLs, and properly categorized severity/precision metadata for every accessibility alert. 
+ +## Objectives + +### User Requirements + +* Add `fullDescription`, `help` (with `text` and `markdown`), and `defaultConfiguration` to SARIF rule descriptors so GitHub shows "Rule help" content inline — Source: task request +* Fix broken IBM Equal Access `helpUri` URLs (wrong URL pattern in normalizer and `help` field containing URL instead of text) — Source: task request +* Enrich SARIF result `message.text` with structured content including violation details, affected snippet, selector, failure summary (`failureSummary`), element count, and WCAG criteria — Source: task request +* Ensure SARIF `properties` carry `precision` and `problem.severity` so GitHub can categorize, filter, and order results — Source: task request +* Add `tool.driver.informationUri`, `tool.driver.semanticVersion`, and `automationDetails.id` for proper tool identification — Source: task request + +### Derived Objectives + +* Create `buildHelpMarkdown()` and `buildHelpText()` helper functions to generate rule help content from `AxeViolation` data — Derived from: all rules need `help.text` and `help.markdown`, which require a structured builder +* Add `extractIbmHelpUrl()` helper to extract base URL from raw IBM `help` field — Derived from: IBM URL fix requires stripping the `#fragment` with encoded JSON from the archive URL +* Update existing SARIF and normalizer tests to cover all new fields and IBM URL fix — Derived from: existing test suite must remain green and cover new behavior +* Keep `helpUri` on the SARIF rule for spec compliance even though GitHub does not render it — Derived from: SARIF v2.1.0 spec compliance; removing it would be non-standard + +## Context Summary + +### Project Files + +* `src/lib/report/sarif-generator.ts` — The SARIF generator (142 lines). Produces `SarifRule` with only 5 fields. Missing: `fullDescription`, `help`, `defaultConfiguration`, `precision`, `problem.severity`. +* `src/lib/scanner/result-normalizer.ts` — IBM normalizer. 
Line 74 maps `r.help` (a URL) to text `help` field. Line 75 constructs a wrong helpUrl pattern. +* `src/lib/types/scan.ts` — `AxeViolation` interface (lines 33–44). Has `description`, `help`, `helpUrl`, `nodes`, `principle`, `engine`, `tags`, `impact`. +* `src/lib/report/__tests__/sarif-generator.test.ts` — 13 existing tests covering SARIF generation. +* `src/lib/scanner/__tests__/result-normalizer.test.ts` — 20+ existing tests covering IBM normalization. +* `src/components/ViolationList.tsx` — HTML report renders all rich data; this is the quality target for SARIF. + +### References + +* `.copilot-tracking/research/2026-03-12/sarif-github-code-scanning-research.md` — Primary research document with full gap analysis, code examples, and alternative evaluation. +* GitHub SARIF Support: `https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning` +* SARIF v2.1.0 OASIS Spec: `https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html` + +### Standards References + +* #file:../../.github/instructions/a11y-remediation.instructions.md — Accessibility remediation patterns and fix prioritization +* #file:../../.github/instructions/wcag22-rules.instructions.md — WCAG 2.2 Level AA compliance rules +* #file:../../.github/instructions/ado-workflow.instructions.md — ADO workflow with branching, commit messages, and PR conventions + +## Implementation Checklist + +### [x] Implementation Phase 1: Fix IBM Equal Access URL and Help Text + + + +* [x] Step 1.1: Add `extractIbmHelpUrl()` helper function to `result-normalizer.ts` + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 15-46) +* [x] Step 1.2: Fix IBM `help` and `helpUrl` field mapping in `normalizeIbmResults()` + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 48-72) +* [x] Step 1.3: Update IBM normalizer tests for new helpUrl and help text behavior + * Details: 
.copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 74-107) +* [x] Step 1.4: Validate phase changes + * Run `npm run lint` and `npm run test -- src/lib/scanner/__tests__/result-normalizer.test.ts` + +### [x] Implementation Phase 2: Enrich SARIF Rule Descriptors + + + +* [x] Step 2.1: Update `SarifRule` interface with all GitHub-supported fields + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 113-147) +* [x] Step 2.2: Add `buildHelpMarkdown()` function + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 149-197) +* [x] Step 2.3: Add `buildHelpText()` function + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 199-230) +* [x] Step 2.4: Add mapping functions for `defaultConfiguration.level`, `precision`, and `problem.severity` + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 232-268) +* [x] Step 2.5: Update `buildRun()` to populate all new fields on each rule + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 270-312) +* [x] Step 2.6: Validate phase changes + * Run `npm run lint` and `npm run test -- src/lib/report/__tests__/sarif-generator.test.ts` + +### [x] Implementation Phase 3: Enrich SARIF Results and Tool Metadata + + + + +* [x] Step 3.1: Enrich `SarifResult.message.text` with description, URL, selector, and element count + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 318-348) +* [x] Step 3.2: Add `tool.driver.informationUri`, `tool.driver.semanticVersion`, and `automationDetails.id` + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 350-389) +* [x] Step 3.3: Update SARIF generator tests for enriched messages and tool metadata + * Details: .copilot-tracking/details/2026-03-12/sarif-github-code-scanning-details.md (Lines 
391-442) +* [x] Step 3.4: Validate phase changes + * Run `npm run lint` and `npm run test -- src/lib/report/__tests__/sarif-generator.test.ts` + +### [x] Implementation Phase 4: Validation + + + +* [x] Step 4.1: Run full project validation + * Execute `npm run lint` + * Execute `npm run build` + * Execute `npm run test` +* [x] Step 4.2: Fix minor validation issues + * No issues found — all validation passed cleanly +* [x] Step 4.3: Report blocking issues + * No blocking issues found + +## Planning Log + +See [sarif-github-code-scanning-log.md](../logs/2026-03-12/sarif-github-code-scanning-log.md) for discrepancy tracking, implementation paths considered, and suggested follow-on work. + +## Dependencies + +* Node.js and npm (project build and test toolchain) +* Vitest (test runner — `npm run test`) +* ESLint (linting — `npm run lint`) +* Next.js build (`npm run build`) +* No new dependencies required — all enrichment uses existing `AxeViolation` data + +## Success Criteria + +* GitHub Code Scanning displays inline "Rule help" with description, WCAG mapping, remediation guidance, and learn more links for every accessibility alert — Traces to: user requirement (rule help) + research Discovery 1 +* IBM rule links resolve correctly using the archive URL pattern extracted from raw IBM data — Traces to: user requirement (IBM URLs) + research Discovery 3 +* Result messages include violation description, scanned URL, selector, affected element count, and `failureSummary` when available — Traces to: user requirement (enriched messages) + research Discovery 5 +* Tags and properties (`precision`, `problem.severity`) enable filtering by WCAG principle and severity ordering — Traces to: user requirement (properties) + research GitHub SARIF Support +* `tool.driver.informationUri` and `tool.driver.semanticVersion` present on the tool driver — Traces to: user requirement (tool identification) +* All existing tests pass and new tests cover the added fields — Traces to: derived 
objective (test coverage) diff --git a/.copilot-tracking/research/2026-03-12/sarif-github-code-scanning-research.md b/.copilot-tracking/research/2026-03-12/sarif-github-code-scanning-research.md new file mode 100644 index 0000000..d56f626 --- /dev/null +++ b/.copilot-tracking/research/2026-03-12/sarif-github-code-scanning-research.md @@ -0,0 +1,411 @@ + +# Task Research: Improve SARIF Output for GitHub Code Scanning + +Enhance the SARIF output produced by the accessibility-scanner so that results displayed in GitHub Security > Code Scanning are complete, rich, and useful for developers — matching the quality of commercial SAST tools like CodeQL. + +## Task Implementation Requests + +* Add `fullDescription`, `help` (with `text` and `markdown`), and `defaultConfiguration` to SARIF rule descriptors so GitHub shows "Rule help" content inline. +* Fix broken IBM Equal Access `helpUri` URLs — two sub-bugs: wrong URL pattern in normalizer, and potential underscore-escaping. +* Enrich SARIF result `message.text` with structured content including violation details, affected snippet, selector, failure summary, and WCAG criteria. +* Ensure SARIF `properties` carry proper tags (`precision`, `problem.severity`) so GitHub can categorize/filter/order results. +* Add `tool.driver.informationUri`, `tool.driver.semanticVersion`, and `automationDetails.id` for proper tool identification. + +## Scope and Success Criteria + +* Scope: SARIF generator (`src/lib/report/sarif-generator.ts`), result normalizer (`src/lib/scanner/result-normalizer.ts`), and supporting types. Does not cover the web UI, PDF, or HTML report. +* Assumptions: + * GitHub Code Scanning uses a **specific subset** of SARIF v2.1.0 — not all spec fields are rendered. + * `helpUri` is **NOT supported by GitHub** — help URLs must be embedded in `help.markdown`. + * Existing `AxeViolation` and `AxeNode` fields contain all data needed for enrichment. 
+ * IBM raw `help` field contains a valid archive URL that should be used directly. +* Success Criteria: + * GitHub displays inline "Rule help" with description, WCAG mapping, remediation guidance, and learn more links for every accessibility alert. + * IBM rule links resolve correctly (use archive URL pattern, not the `/tools/help/` pattern). + * Result messages include violation description, affected element count, and scanned URL. + * Tags/properties enable filtering by WCAG principle and severity ordering. + +## Outline + +1. Current SARIF output analysis — complete field inventory and gaps +2. GitHub SARIF ingestion — exact fields that enable rich display +3. IBM Equal Access URL issue — root cause (two bugs) and fix +4. Selected approach — enriched SARIF with all GitHub-supported fields +5. Implementation plan with code examples +6. Considered alternatives + +## Potential Next Research + +* Validate with a test SARIF upload — confirm `help.markdown` renders as expected. +* Research whether IBM `/rules/tools/help/{ruleId}` endpoint redirects or is deprecated. +* Investigate whether `result.message.markdown` is silently supported by GitHub (not documented). +* Consider adding WCAG success criterion text directly to `help.markdown` using a static mapping table. + +## Research Executed + +### File Analysis + +* `src/lib/report/sarif-generator.ts` (142 lines) — The entire SARIF generator. Produces `SarifRule` with only 5 fields (`id`, `name`, `shortDescription`, `helpUri`, `properties.tags`). Missing: `fullDescription`, `help`, `defaultConfiguration`, precision/severity properties. +* `src/lib/scanner/result-normalizer.ts` (lines 60–86) — IBM normalizer. Line 74 maps `r.help` (actually a URL) to the text `help` field. Line 75 constructs a different `helpUrl` using `/rules/tools/help/` pattern instead of the raw IBM archive URL. +* `src/lib/types/scan.ts` (lines 32–65) — `AxeViolation` has fields `description`, `help`, `helpUrl`, `nodes`, `principle`, `engine`. 
`AxeNode` has `html`, `target`, `impact`, `failureSummary`. `failureSummary`, `principle`, and `engine` never reach SARIF. +* `src/components/ViolationList.tsx` — HTML report renders all rich data including impact badges, description, snippet, selector, failure summary, element count, and learn more links. This is the quality target for SARIF. +* `results/https_/example.com.json` (lines 1–40) — Raw IBM results. The `help` field contains full archive URLs like `https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#...` — these work correctly. + +### Code Search Results + +* `helpUri` appears only in `sarif-generator.ts` — set from `violation.helpUrl`. +* `able.ibm.com` appears in `result-normalizer.ts:75` (constructed URL) and raw result files (archive URL). +* `fullDescription`, `help.text`, `help.markdown` never appear in the codebase — confirming they are missing. + +### External Research + +* GitHub SARIF Support: `https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning` + * `fullDescription.text` — **Required** by GitHub. + * `help.text` — **Required** by GitHub. This is the "Rule help" panel. + * `help.markdown` — **Recommended**. When present, **displayed instead of** `help.text`. + * `helpUri` — **NOT listed** in GitHub's supported properties. GitHub does not render it. + * `properties.precision` — **Recommended**. Affects result ordering. + * `properties.problem.severity` — **Recommended**. Affects result ordering. +* SARIF v2.1.0 OASIS Spec: `https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html` + * §3.49.13: `help` is a `multiformatMessageString` with `text` (required) and `markdown` (optional). + * §3.49.12: `helpUri` is valid SARIF but not guaranteed to be displayed by any viewer. 
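
To illustrate the `help.text` requirement listed above, a plain-text builder could look roughly like the sketch below. The `ViolationLike` type and the line labels (`Impact:`, `WCAG:`, `Learn more:`) are assumptions made for this example. They are not the project's actual `AxeViolation` type, nor necessarily the output of the shipped `buildHelpText()`.

```typescript
// Hypothetical minimal shape for illustration only; the real AxeViolation
// type in src/lib/types/scan.ts carries more fields.
interface ViolationLike {
  help: string;
  description: string;
  impact: string;
  helpUrl: string;
  tags: string[];
  principle?: string;
}

// Build plain-text rule help (no markdown syntax) suitable for SARIF `help.text`.
function buildHelpText(v: ViolationLike): string {
  // Keep only tags that look like WCAG criterion tags, e.g. "wcag143".
  const wcagTags = v.tags.filter((t) => /^wcag\d/.test(t));
  const lines: string[] = [v.help, '', v.description, '', `Impact: ${v.impact}`];
  if (v.principle) {
    lines.push(`Principle: ${v.principle}`);
  }
  if (wcagTags.length > 0) {
    lines.push(`WCAG: ${wcagTags.join(', ')}`);
  }
  // The help URL is written out in full because plain text has no link syntax.
  lines.push('', `Learn more: ${v.helpUrl}`);
  return lines.join('\n');
}
```

Because GitHub shows `help.markdown` instead of `help.text` when both are present, this output would only serve as the fallback.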
+ +### Project Conventions + +* Standards referenced: SARIF v2.1.0 OASIS spec, GitHub SARIF support docs, CodeQL SARIF patterns +* Instructions followed: `ado-workflow.instructions.md`, `a11y-remediation.instructions.md`, `wcag22-rules.instructions.md` + +## Key Discoveries + +### Discovery 1: Root Cause of "No Rule Help Available" + +GitHub Code Scanning requires `help.text` and recommends `help.markdown` on every `reportingDescriptor` (rule). The current generator produces neither. This is why **every** accessibility alert shows "no rule help available for this alert." + +**Evidence**: GitHub docs explicitly state `help.text` is Required. Current `SarifRule` interface at [sarif-generator.ts](src/lib/report/sarif-generator.ts#L20-L26) only has `id`, `name`, `shortDescription`, `helpUri`, `properties`. + +### Discovery 2: `helpUri` Is Ignored by GitHub + +GitHub's supported properties documentation does **not list `helpUri`** at all. The current tool relies on `helpUri` for "Learn more" links, but GitHub never renders them. The fix is to embed help URLs as markdown links inside `help.markdown`. + +**Evidence**: GitHub SARIF support docs — `helpUri` absent from the `reportingDescriptor` supported properties table. + +### Discovery 3: IBM Equal Access URL — Two Bugs + +**Bug A — Wrong URL pattern**: `normalizeIbmResults()` at [result-normalizer.ts](src/lib/scanner/result-normalizer.ts#L75) constructs `https://able.ibm.com/rules/tools/help/${r.ruleId}` — a generic endpoint that may not exist. The raw IBM data contains a working archive URL in the `help` field: `https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/{ruleId}.html#...`. + +**Bug B — Underscore markdown escaping**: Somewhere in the pipeline, underscores in rule IDs (e.g., `label_name_visible`) are backslash-escaped to `label\_name\_visible`, producing 404s. This likely happens when GitHub renders `helpUri` or SARIF content through a markdown processor. 
+ +**Fix**: Extract the base URL (before `#` fragment) from the raw IBM `help` field and use it as the canonical help URL. Embed it in `help.markdown` instead of relying on `helpUri`. + +### Discovery 4: IBM `help` Field Is a URL, Not Text + +The normalizer at [result-normalizer.ts](src/lib/scanner/result-normalizer.ts#L74) does `help: r.help ?? r.message`. But `r.help` in the raw IBM data is a **URL** (e.g., `https://able.ibm.com/rules/archives/...`), not a text description. This means the `help` field of the normalized violation contains a URL string instead of human-readable guidance for IBM rules. + +**Evidence**: [example.com.json](results/https_/example.com.json) line 29 — the `help` field is a full URL with encoded JSON fragment. + +### Discovery 5: Rich Data Available But Lost + +The HTML report (ViolationList component) displays: `failureSummary`, CSS selectors (`target`), element count, principle grouping, and engine source. None of these flow into SARIF output. The SARIF `message.text` is a simple concatenation: `"{help} ({url} — {target})"`. + +### Discovery 6: `fullDescription.text` Is Missing + +GitHub marks `fullDescription.text` as **Required**. The current generator only sets `shortDescription.text` from `violation.description`. `fullDescription` is not set at all. 
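
For the message gap described in Discovery 5, one possible shape for an enriched `message.text` builder is sketched below. The `MessageInput` type, the function name `buildResultMessage`, and the ` | ` separator are all assumptions for illustration; only the list of included fields comes from these notes.

```typescript
// Hypothetical input shape; field names loosely mirror AxeViolation/AxeNode
// but are assumptions made for this sketch.
interface MessageInput {
  help: string;
  description: string;
  scannedUrl: string;
  selector: string;
  elementCount: number;
  failureSummary?: string;
}

// Compose a single-line message whose first sentence carries the rule title
// and description, since GitHub may show only the first sentence.
function buildResultMessage(m: MessageInput): string {
  const noun = m.elementCount === 1 ? 'element' : 'elements';
  const parts: string[] = [
    `${m.help}: ${m.description}`,
    `Scanned URL: ${m.scannedUrl}`,
    `Selector: ${m.selector}`,
    `${m.elementCount} ${noun} affected`,
  ];
  // failureSummary is optional on a node, so append it only when present.
  if (m.failureSummary) {
    parts.push(m.failureSummary);
  }
  return parts.join(' | ');
}
```

Putting the title and description first keeps the most useful text visible when GitHub truncates the message to its first sentence.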
+ +### Discovery 7: GitHub Upload Limits + +| Limit | Value | +|---|---| +| File size (gzip) | 10 MB | +| Runs per file | 20 | +| Results per run | 25,000 (top 5,000 shown) | +| Rules per run | 25,000 | +| Tags per rule | 20 (10 shown) | + +### Implementation Patterns + +#### Current SarifRule Interface (5 fields) + +```typescript +// sarif-generator.ts lines 20-26 +interface SarifRule { + id: string; + name: string; + shortDescription: { text: string }; + helpUri: string; + properties: { tags: string[] }; +} +``` + +#### Target SarifRule Interface (all GitHub-supported fields) + +```typescript +interface SarifRule { + id: string; + name: string; + shortDescription: { text: string }; + fullDescription: { text: string }; + helpUri: string; // keep for SARIF compliance, but not rendered by GitHub + help: { + text: string; // required by GitHub — plain text rule documentation + markdown: string; // recommended — rich markdown, displayed instead of text + }; + defaultConfiguration: { + level: 'error' | 'warning' | 'note'; + }; + properties: { + tags: string[]; + precision: 'very-high' | 'high' | 'medium' | 'low'; + 'problem.severity': 'error' | 'warning' | 'recommendation'; + }; +} +``` + +### Complete Examples + +#### Ideal SARIF Rule for an Accessibility Violation + +```json +{ + "id": "color-contrast", + "name": "color-contrast", + "shortDescription": { + "text": "Ensure the contrast between foreground and background colors meets WCAG 2 AA thresholds" + }, + "fullDescription": { + "text": "Ensures the contrast between foreground and background colors meets WCAG 2 AA minimum contrast ratio thresholds. Low contrast text is difficult or impossible for many users to read." + }, + "help": { + "text": "Ensure sufficient color contrast\n\nElements must meet minimum color contrast ratio thresholds.\n\nFix: Increase the contrast ratio between foreground and background colors. 
Use at least 4.5:1 for normal text and 3:1 for large text.\n\nWCAG: 1.4.3 Contrast (Minimum) (Level AA)\n\nLearn more: https://dequeuniversity.com/rules/axe/4.10/color-contrast", + "markdown": "# Ensure sufficient color contrast\n\nElements must meet minimum color contrast ratio thresholds.\n\n## Why This Matters\n\nLow contrast text is difficult or impossible to read for many users, including those with low vision, color blindness, or age-related vision changes.\n\n## How to Fix\n\n- Increase the contrast ratio between foreground and background colors\n- Use a contrast ratio of at least **4.5:1** for normal text\n- Use a contrast ratio of at least **3:1** for large text (18pt or 14pt bold)\n\n## WCAG Criteria\n\n- [1.4.3 Contrast (Minimum) (Level AA)](https://www.w3.org/WAI/WCAG22/Understanding/contrast-minimum.html)\n\n## Learn More\n\n- [Deque University: color-contrast](https://dequeuniversity.com/rules/axe/4.10/color-contrast)\n" + }, + "helpUri": "https://dequeuniversity.com/rules/axe/4.10/color-contrast", + "defaultConfiguration": { + "level": "error" + }, + "properties": { + "tags": ["accessibility", "WCAG2AA", "wcag143"], + "precision": "very-high", + "problem.severity": "error" + } +} +``` + +#### Ideal SARIF Result + +```json +{ + "ruleId": "color-contrast", + "ruleIndex": 0, + "level": "error", + "message": { + "text": "Ensure sufficient color contrast: Elements must meet minimum color contrast ratio thresholds. 
Scanned URL: https://example.com — Selector: button.btn-secondary — 3 elements affected" + }, + "locations": [ + { + "physicalLocation": { + "artifactLocation": { "uri": "example.com/index" }, + "region": { + "startLine": 1, + "startColumn": 1, + "snippet": { + "text": "" + } + } + } + } + ], + "partialFingerprints": { + "primaryLocationLineHash": "a1b2c3d4" + } +} +``` + +### Configuration Examples + +#### IBM helpUrl Fix in result-normalizer.ts + +```typescript +// Extract base URL from IBM help field (strip fragment with encoded JSON context) +function extractIbmHelpUrl(rawHelp: string | undefined): string { + if (!rawHelp) return ''; + try { + const url = new URL(rawHelp); + return `${url.origin}${url.pathname}`; // strip #fragment + } catch { + return ''; // not a URL: return empty so the caller's fallback URL is used + } +} + +// In normalizeIbmResults(): +helpUrl: extractIbmHelpUrl(r.help) || `https://able.ibm.com/rules/archives/latest/doc/en-US/${r.ruleId}.html`, +help: r.message, // Use the message text, not the URL, as the help text +``` + +#### help.markdown Build Function + +```typescript +function buildHelpMarkdown(violation: AxeViolation): string { + const wcagTags = violation.tags.filter(t => /^wcag\d/.test(t)); + const lines: string[] = [ + `# ${violation.help}`, + '', + violation.description, + '', + `**Impact:** ${violation.impact}`, + ]; + + if (violation.principle) { + lines.push(`**Principle:** ${violation.principle}`); + } + + if (wcagTags.length > 0) { + lines.push('', '## WCAG Criteria', ''); + for (const tag of wcagTags) { + lines.push(`- \`${tag}\``); + } + } + + lines.push('', '## Learn More', ''); + lines.push(`- [Rule documentation](${violation.helpUrl})`); + + return lines.join('\n'); +} +``` + +## Technical Scenarios + +### Scenario: Enriched SARIF Rule Descriptors + +The core problem is that GitHub cannot display rule help because the `help` property is missing from every `reportingDescriptor` in the SARIF output. 
+ +**Requirements:** + +* Every rule must have `fullDescription.text`, `help.text`, and `help.markdown`. +* `help.markdown` should include: rule title, description, impact, WCAG criteria, principle, and a learn more link. +* `defaultConfiguration.level` must map from `violation.impact`. +* `properties` must include `precision` and `problem.severity`. + +**Preferred Approach:** + +Build a `buildHelpMarkdown()` function that generates structured markdown from `AxeViolation` data. Update the `SarifRule` interface and the `buildRun()` function to populate all GitHub-required fields. Keep `helpUri` for SARIF spec compliance but do not rely on it for display. + +```text +src/lib/report/sarif-generator.ts (modify — add fields to interface and buildRun) +src/lib/scanner/result-normalizer.ts (modify — fix IBM helpUrl and help text) +``` + +```mermaid +flowchart TD + A[AxeViolation] -->|id, description, help, helpUrl, impact, tags, principle| B[buildRun] + B --> C[SarifRule] + C -->|fullDescription.text| D[violation.description] + C -->|help.text| E[buildHelpText] + C -->|help.markdown| F[buildHelpMarkdown] + C -->|defaultConfiguration.level| G[mapImpactToLevel] + C -->|properties.precision| H[mapEngineToPrecision] + C -->|properties.problem.severity| I[mapImpactToSeverity] + B --> J[SarifResult] + J -->|message.text| K[enriched message with URL, selector, element count] +``` + +**Implementation Details:** + +1. Update `SarifRule` interface to include all GitHub-supported fields. +2. Add `buildHelpMarkdown(violation)` function. +3. Add `buildHelpText(violation)` function (plain text fallback). +4. Update `buildRun()` to populate new fields on each rule. +5. Enrich `SarifResult.message.text` with description and element count. +6. Add `tool.driver.informationUri` and `tool.driver.semanticVersion`. + +#### Considered Alternatives + +**Alternative A: Minimal fix — only add `help.text`** + +* Pros: Smallest change, addresses the "no rule help" issue. 
+* Cons: Loses the opportunity for rich markdown display. No structured help. +* Rejected because: `help.markdown` is the key differentiator for useful alerts. Minimal effort to add both. + +**Alternative B: Generate help content from a static WCAG mapping table** + +* Pros: Could provide detailed WCAG success criterion text and specific remediation guidance. +* Cons: Requires maintaining a mapping table. Current data already contains `description` and `help` fields with useful content. +* Rejected because: The violation data already contains sufficient content. A static table may become stale. Can be added later as an enhancement. + +**Alternative C: Use `result.message.markdown` for rich results** + +* Pros: Could render rich markdown in the result detail view. +* Cons: GitHub docs do **not** list `message.markdown` as a supported property. Relies on undocumented behavior. +* Rejected because: Unreliable. Use `message.text` with information-dense first sentence instead. + +### Scenario: Fix IBM Equal Access URLs + +The root cause is two bugs: wrong URL pattern in the normalizer, and the IBM `help` field being treated as text when it is actually a URL. + +**Requirements:** + +* IBM rule `helpUrl` must use the working archive URL pattern from the raw IBM data. +* The `help` text on the normalized violation must be human-readable (the message), not a URL. +* URLs in `help.markdown` must not have underscores escaped. + +**Preferred Approach:** + +Extract the base URL (before `#` fragment) from the raw IBM `help` field: `r.help`. Use it as `helpUrl`. Use `r.message` as the human-readable `help` text. If `r.help` is not a valid URL, fall back to constructing the archive URL from `r.ruleId`. 
+ +```text +src/lib/scanner/result-normalizer.ts (modify lines 74-75) +``` + +**Implementation Details:** + +```typescript +// Add helper function +function extractIbmHelpUrl(rawHelp: string | undefined, ruleId: string): string { + if (rawHelp) { + try { + const url = new URL(rawHelp); + return `${url.origin}${url.pathname}`; // strip #fragment with encoded JSON + } catch { + // not a URL, fall through + } + } + return `https://able.ibm.com/rules/archives/latest/doc/en-US/${ruleId}.html`; +} + +// In normalizeIbmResults(): +help: r.message, // human-readable text, not the URL +helpUrl: extractIbmHelpUrl(r.help, r.ruleId), +``` + +#### Considered Alternatives + +**Alternative: Keep the `/rules/tools/help/{ruleId}` pattern** + +* Rejected because: This URL pattern produces 404s. The raw IBM data provides working archive URLs. + +**Alternative: Hardcode a specific archive version like `2026.03.04`** + +* Rejected because: The version changes over time. Extracting from the raw data is forward-compatible. 
+ +## Summary + +| Area | Current State | Target State | Priority | +|---|---|---|---| +| `rules[].help.text` | Missing | Build from `violation.help` + `description` | **P0** | +| `rules[].help.markdown` | Missing | Rich markdown with title, impact, WCAG, learn more | **P0** | +| `rules[].fullDescription.text` | Missing | Set to `violation.description` | **P0** | +| IBM helpUrl | Wrong URL pattern | Extract from raw IBM `help` field | **P0** | +| IBM help text | Contains URL string | Use `r.message` instead | **P0** | +| `rules[].defaultConfiguration.level` | Missing | Map from `violation.impact` | **P1** | +| `rules[].properties.precision` | Missing | Map from engine (`very-high` for axe, `high` for IBM) | **P1** | +| `rules[].properties.problem.severity` | Missing | Map from impact | **P1** | +| `result.message.text` | Basic concatenation | Enriched: description + URL + element count | **P1** | +| `tool.driver.informationUri` | Missing | Link to GitHub repo | **P2** | +| `tool.driver.semanticVersion` | Missing | Same as version | **P2** | +| `helpUri` | Present but ignored | Keep for spec compliance | **P2** | + +### Files to Modify + +1. **`src/lib/report/sarif-generator.ts`** — Update `SarifRule` interface, add `buildHelpMarkdown()` and `buildHelpText()`, update `buildRun()` to populate all new fields, enrich `message.text`. +2. **`src/lib/scanner/result-normalizer.ts`** — Fix IBM `helpUrl` construction (extract from raw `help` field), fix IBM `help` text (use `r.message`). +3. **`src/lib/report/__tests__/sarif-generator.test.ts`** — Update tests for new fields. +4. **`src/lib/scanner/__tests__/result-normalizer.test.ts`** — Update IBM helpUrl and help text tests. 
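
The P1 mapping rows in the summary table above could be sketched as three small functions. This is a sketch under assumptions: the impact-to-severity values follow the mapping described in these notes (critical/serious to `error`, moderate to `warning`, minor to `note` or `recommendation`), and the engine identifiers `'axe-core'` and `'ibm-equal-access'` are assumed string values that may differ from the real `engine` field.

```typescript
type Impact = 'critical' | 'serious' | 'moderate' | 'minor';
type SarifLevel = 'error' | 'warning' | 'note';
type ProblemSeverity = 'error' | 'warning' | 'recommendation';
type Precision = 'very-high' | 'high' | 'medium' | 'low';

// Map violation impact to SARIF defaultConfiguration.level.
function mapImpactToLevel(impact: Impact): SarifLevel {
  switch (impact) {
    case 'critical':
    case 'serious':
      return 'error';
    case 'moderate':
      return 'warning';
    default:
      return 'note'; // minor
  }
}

// Map violation impact to properties['problem.severity'] for non-security rules.
function mapImpactToSeverity(impact: Impact): ProblemSeverity {
  switch (impact) {
    case 'critical':
    case 'serious':
      return 'error';
    case 'moderate':
      return 'warning';
    default:
      return 'recommendation'; // minor
  }
}

// Map the scanning engine to properties.precision.
// Engine identifier strings are assumptions for this sketch.
function mapEngineToPrecision(engine: string | undefined): Precision {
  if (engine === 'axe-core') return 'very-high';
  if (engine === 'ibm-equal-access') return 'high';
  return 'medium'; // unknown or missing engine
}
```

Keeping `level` and `problem.severity` as separate functions reflects that the two fields share inputs but differ for minor impact (`note` versus `recommendation`).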
diff --git a/.copilot-tracking/research/subagents/2026-03-12/github-sarif-spec-research.md b/.copilot-tracking/research/subagents/2026-03-12/github-sarif-spec-research.md new file mode 100644 index 0000000..9ee30b9 --- /dev/null +++ b/.copilot-tracking/research/subagents/2026-03-12/github-sarif-spec-research.md @@ -0,0 +1,423 @@ +# GitHub SARIF Specification Research for Code Scanning Display + +**Status:** Complete +**Date:** 2026-03-12 +**Topic:** SARIF v2.1.0 fields that produce rich display in GitHub Security tab + +--- + +## 1. Executive Summary + +GitHub Code Scanning uses a **specific subset** of SARIF v2.1.0 properties. The **critical missing piece** causing "no rule help available" in the current accessibility scanner is the absence of the `help` property (`help.text` and `help.markdown`) on `reportingDescriptor` (rule) objects. Additionally, the current generator is missing `fullDescription` on rules, which GitHub marks as **Required**. + +### Key Fix: Add `help.text` and `help.markdown` to every rule + +--- + +## 2. Complete Field Mapping: What GitHub Uses for Display + +### 2.1 reportingDescriptor (rules[]) — The Most Important Object + +This is where rule metadata lives. GitHub's documentation explicitly lists these properties: + +| Property | Required? | GitHub Display Usage | +|---|---|---| +| `id` | **Required** | Unique rule identifier. Used in URLs, filtering, and cross-referencing. | +| `name` | Optional | Displayed to allow filtering by rule. Limited to **255 characters**. | +| `shortDescription.text` | **Required** | Displayed next to associated results. Limited to **1024 characters**. | +| `fullDescription.text` | **Required** | Displayed next to associated results. Limited to **1024 characters**. | +| `defaultConfiguration.level` | Optional | Default severity: `note`, `warning`, `error`. Defaults to `warning`. | +| `help.text` | **Required** | Help documentation shown next to results. 
**This is the rule help panel content.** | +| `help.markdown` | Optional (Recommended) | **When present, displayed INSTEAD of `help.text`**. This is the rich expandable help content shown in alert detail. | +| `helpUri` | Not listed as supported | **NOT in GitHub's supported properties table.** See section 2.6 below. | +| `properties.tags[]` | Optional | Array of strings for filtering results on GitHub (e.g., `security`, `accessibility`). | +| `properties.precision` | Optional (Recommended) | `very-high`, `high`, `medium`, `low`. Results ordered by precision. | +| `properties.problem.severity` | Optional (Recommended) | For non-security: `error`, `warning`, `recommendation`. | +| `properties.security-severity` | Optional (Recommended for security) | Numeric `0.0–10.0`. Triggers security severity mapping: >9.0=critical, 7.0–8.9=high, 4.0–6.9=medium, 0.1–3.9=low. | + +### 2.2 result object — Per-Alert Data + +| Property | Required? | GitHub Display Usage | +|---|---|---| +| `ruleId` | Optional | Rule identifier. Used for filtering by rule. | +| `ruleIndex` | Optional | Index into `rules[]` array. | +| `rule` | Optional | Reference to the reporting descriptor. | +| `level` | Optional | Overrides `defaultConfiguration.level`. Values: `note`, `warning`, `error`. | +| `message.text` | **Required** | **Alert title/description.** First sentence shown when space is limited. | +| `locations[]` | **Required** | Physical locations. At least one required. Only first used for file annotation. Max 10. | +| `partialFingerprints` | **Required** | Fingerprint for deduplication. Only `primaryLocationLineHash` is used. | +| `codeFlows[].threadFlows[].locations[]` | Optional | If present, GitHub expands code flow visualization. | +| `relatedLocations[]` | Optional | Linked when embedded in result message via `[text](id)` syntax. | + +### 2.3 physicalLocation object + +| Property | Required? 
| +|---|---| +| `artifactLocation.uri` | **Required** — relative path from repo root recommended. | +| `region.startLine` | **Required** | +| `region.startColumn` | **Required** | +| `region.endLine` | **Required** | +| `region.endColumn` | **Required** | + +### 2.4 toolComponent object + +| Property | Required? | +|---|---| +| `name` | **Required** | +| `version` | Optional (not used if `semanticVersion` present) | +| `semanticVersion` | Optional (preferred over `version`) | +| `rules[]` | **Required** | + +### 2.5 sarifLog object + +| Property | Required? | +|---|---| +| `$schema` | **Required** — e.g., `https://json.schemastore.org/sarif-2.1.0.json` | +| `version` | **Required** — must be `"2.1.0"` | +| `runs[]` | **Required** | + +### 2.6 helpUri — The Missing "Learn More" Link + +**Critical finding:** `helpUri` is **NOT listed in GitHub's supported properties table** for `reportingDescriptor`. GitHub's documentation does not mention `helpUri` at all in its supported properties section. + +This means: +- GitHub does **not** render `helpUri` as a clickable "Learn more" link in the alert detail. +- If a "Learn more" link is desired, include it as a markdown link **within `help.markdown`**. +- The current broken links in alerts are likely because `helpUri` is the only reference and GitHub ignores it. + +**Workaround:** Embed the help URL directly in `help.markdown`: +```markdown +[Learn more about this rule](https://dequeuniversity.com/rules/axe/4.10/color-contrast) +``` + +--- + +## 3. The `help` Property — The Key Missing Piece + +### 3.1 What It Does + +Per SARIF v2.1.0 §3.49.13: The `help` property is a `multiformatMessageString` object containing: +- `text` (required on the object): Plain text documentation for the rule. +- `markdown` (optional): GitHub Flavored Markdown documentation. + +Per GitHub's documentation: +> **`help.text`** — Required. Documentation for the rule using text format. 
Code scanning displays this help documentation next to the associated results. +> +> **`help.markdown`** — Optional (Recommended). Documentation for the rule using Markdown format. Code scanning displays this help documentation next to the associated results. **When `help.markdown` is available, it is displayed instead of `help.text`.** + +### 3.2 How CodeQL Structures `help.markdown` + +CodeQL produces rich alerts with structured markdown help. The pattern is: + +```json +{ + "id": "js/xss", + "name": "CrossSiteScripting", + "shortDescription": { + "text": "Cross-site scripting vulnerability" + }, + "fullDescription": { + "text": "Writing user input directly to a web page allows for a cross-site scripting vulnerability." + }, + "help": { + "text": "# Cross-site scripting\n\nWriting user input directly to a web page...", + "markdown": "# Cross-site scripting\n\nWriting user input directly to a web page allows for a cross-site scripting vulnerability.\n\n## Recommendation\n\nSanitize all user input before...\n\n## Example\n\n```javascript\n// BAD\nresponse.write(req.query.name);\n```\n\n## References\n\n- [OWASP XSS Prevention](https://example.com)\n" + }, + "defaultConfiguration": { + "level": "error" + }, + "properties": { + "tags": ["security", "external/cwe/cwe-079"], + "precision": "high", + "security-severity": "6.1" + } +} +``` + +### 3.3 Recommended `help.markdown` Structure for Accessibility Rules + +```markdown +# Rule Title (e.g., "Ensure sufficient color contrast") + +Brief description of what the rule checks. + +## Why This Matters + +Explanation of accessibility impact and who is affected. + +## How to Fix + +Step-by-step remediation guidance. + +## WCAG Criteria + +- [WCAG 2.2 Success Criterion X.Y.Z](https://www.w3.org/WAI/WCAG22/Understanding/...) + +## Learn More + +- [Deque University: rule-name](https://dequeuniversity.com/rules/axe/4.10/rule-name) +- [WCAG Understanding Document](https://www.w3.org/WAI/WCAG22/Understanding/...) 
+``` + +--- + +## 4. `result.message` — Alert Title Display + +### 4.1 message.text + +**Required.** GitHub displays `message.text` as the alert title. Per GitHub docs: +> Only the first sentence of the message will be displayed when visible space is limited. + +### 4.2 message.markdown + +The SARIF spec (§3.11.9) supports `markdown` on message objects. However, GitHub's supported properties table for `result` only lists `message.text`. GitHub does **not** list `message.markdown` as a supported property. + +**Recommendation:** Use `message.text` with a clear, information-dense first sentence. Do not rely on `message.markdown` for result messages. + +--- + +## 5. Severity Mapping + +### 5.1 defaultConfiguration.level + +Maps directly to GitHub severity badges: +- `"error"` → Error (red) +- `"warning"` → Warning (yellow) +- `"note"` → Note (blue) + +Defaults to `"warning"` if absent. + +### 5.2 properties.security-severity + +For rules tagged with `security` in `properties.tags`, this numeric score (0.0–10.0) maps to: +- **>9.0** → Critical +- **7.0–8.9** → High +- **4.0–6.9** → Medium +- **0.1–3.9** → Low + +### 5.3 properties.problem.severity + +For non-security rules: `error`, `warning`, `recommendation`. Used with `precision` to order results. + +### 5.4 Recommended Mapping for Accessibility + +| axe-core Impact | `defaultConfiguration.level` | `properties.problem.severity` | +|---|---|---| +| critical | `error` | `error` | +| serious | `error` | `error` | +| moderate | `warning` | `warning` | +| minor | `note` | `recommendation` | + +--- + +## 6. Tags and Filtering + +`properties.tags[]` on rules allows GitHub filtering. Max **20 tags** per rule (only 10 displayed). + +Recommended tags for accessibility: +```jsonc +{ + "tags": [ + "accessibility", + "WCAG2.2", + "WCAG2.1", + "level-A", // or "level-AA" + "cat.color", // axe-core category + "best-practice" // for best-practice rules + ] +} +``` + +--- + +## 7. 
Fingerprinting and Deduplication + +### 7.1 partialFingerprints + +**Required** by GitHub. Only `primaryLocationLineHash` is used. + +GitHub uses the values in `partialFingerprints` when they are provided. The `upload-sarif` action can compute them if they are missing, but the API endpoint cannot. + +### 7.2 Recommended approach + +Include a hash based on: `ruleId + target selector + page URL`. + +--- + +## 8. Upload Limits + +| Limit | Value | Notes | +|---|---|---| +| File size (gzip) | **10 MB** max | | +| Runs per file | **20** | | +| Results per run | **25,000** | Only top 5,000 shown (by severity) | +| Rules per run | **25,000** | | +| Tool extensions per run | **100** | | +| Thread flow locations per result | **10,000** | Only top 1,000 shown | +| Locations per result | **1,000** | Only 100 shown | +| Tags per rule | **20** | Only 10 shown | +| Total alert limit | **1,000,000** | | + +--- + +## 9. Gaps in Current SARIF Generator + +Comparing the existing `sarif-generator.ts` against GitHub requirements: + +| Field | Current State | Required State | Priority | +|---|---|---|---| +| `rules[].fullDescription` | **MISSING** | Required by GitHub | **P0** | +| `rules[].help.text` | **MISSING** | Required by GitHub | **P0** | +| `rules[].help.markdown` | **MISSING** | Recommended — renders rich help | **P0** | +| `rules[].helpUri` | Present | **Not used by GitHub** — remove or keep | P2 | +| `rules[].defaultConfiguration.level` | **MISSING** | Optional but important for severity | **P1** | +| `rules[].properties.precision` | **MISSING** | Recommended for ordering | P1 | +| `rules[].properties.problem.severity` | **MISSING** | Recommended for ordering | P1 | +| `rules[].properties.security-severity` | **MISSING** | Only for security-tagged rules | P2 | +| `result.locations[].region.startLine` | **MISSING** (only snippet) | Required by GitHub | **P0** | +| `result.locations[].region.startColumn` | **MISSING** | Required by GitHub | **P0** | +| `result.locations[].region.endLine` 
| **MISSING** | Required by GitHub | **P0** | +| `result.locations[].region.endColumn` | **MISSING** | Required by GitHub | **P0** | +| `$schema` | Uses OASIS raw URL | Should use `https://json.schemastore.org/sarif-2.1.0.json` | P2 | + +--- + +## 10. Ideal SARIF Structure — Complete Example + +```json +{ + "$schema": "https://json.schemastore.org/sarif-2.1.0.json", + "version": "2.1.0", + "runs": [ + { + "tool": { + "driver": { + "name": "accessibility-scanner", + "semanticVersion": "1.0.0", + "informationUri": "https://github.com/devopsabcs-engineering/accessibility-scan-demo-app", + "rules": [ + { + "id": "color-contrast", + "name": "color-contrast", + "shortDescription": { + "text": "Ensure the contrast between foreground and background colors meets WCAG 2 AA minimum contrast ratio thresholds." + }, + "fullDescription": { + "text": "Ensures the contrast between foreground and background colors meets WCAG 2 AA minimum contrast ratio thresholds. Low contrast text is difficult or impossible for many users to read." 
+ }, + "help": { + "text": "Elements must meet minimum color contrast ratio thresholds.\n\nFix any of the following:\n- Increase the contrast ratio between the foreground and background colors.\n- Use larger or bolder text.\n\nWCAG Criteria: 1.4.3 Contrast (Minimum) (Level AA)\n\nLearn more: https://dequeuniversity.com/rules/axe/4.10/color-contrast", + "markdown": "# Ensure sufficient color contrast\n\nElements must meet minimum color contrast ratio thresholds.\n\n## Why This Matters\n\nLow contrast text is difficult or impossible to read for many users, including those with low vision, color blindness, or age-related vision changes.\n\n## How to Fix\n\n- Increase the contrast ratio between foreground and background colors\n- Use a contrast ratio of at least **4.5:1** for normal text\n- Use a contrast ratio of at least **3:1** for large text (18pt or 14pt bold)\n\n## WCAG Criteria\n\n- [1.4.3 Contrast (Minimum) (Level AA)](https://www.w3.org/WAI/WCAG22/Understanding/contrast-minimum.html)\n\n## Learn More\n\n- [Deque University: color-contrast](https://dequeuniversity.com/rules/axe/4.10/color-contrast)\n" + }, + "helpUri": "https://dequeuniversity.com/rules/axe/4.10/color-contrast", + "defaultConfiguration": { + "level": "error" + }, + "properties": { + "tags": [ + "accessibility", + "WCAG2AA", + "cat.color", + "wcag143" + ], + "precision": "very-high", + "problem.severity": "error" + } + } + ] + } + }, + "results": [ + { + "ruleId": "color-contrast", + "ruleIndex": 0, + "level": "error", + "message": { + "text": "Element has insufficient color contrast of 2.52 (foreground: #6c757d, background: #ffffff, required ratio: 4.5:1). 
Found on https://example.com — button.btn-secondary" + }, + "locations": [ + { + "physicalLocation": { + "artifactLocation": { + "uri": "example.com/index.html" + }, + "region": { + "startLine": 1, + "startColumn": 1, + "endLine": 1, + "endColumn": 2, + "snippet": { + "text": "" + } + } + } + } + ], + "partialFingerprints": { + "primaryLocationLineHash": "a1b2c3d4:1" + } + } + ], + "columnKind": "utf16CodeUnits" + } + ] +} +``` + +--- + +## 11. relatedLocations and Embedded Links + +GitHub supports `relatedLocations[]` with embedded links in `message.text`: + +```json +{ + "message": { + "text": "Element has insufficient contrast. See [related element](0)." + }, + "relatedLocations": [ + { + "id": 0, + "physicalLocation": { ... }, + "message": { "text": "Related element" } + } + ] +} +``` + +The `[text](id)` syntax in message text creates clickable links to related locations. + +--- + +## 12. codeFlows + +GitHub will expand `codeFlows` if present. For accessibility scanning, this is generally not applicable since violations are typically single-location findings rather than execution path issues. + +--- + +## 13. How helpUri Is (Not) Processed + +**Key Finding:** `helpUri` is defined in SARIF v2.1.0 spec (§3.49.12) as a localizable absolute URI for primary documentation. However, GitHub's supported properties documentation for `reportingDescriptor` does **not list `helpUri`** at all. + +This means: +1. GitHub likely **ignores** `helpUri` entirely. +2. The "broken links" reported in the current tool are likely because `helpUri` is the only reference URL, and GitHub never renders it. +3. **Solution:** Embed help URLs in `help.markdown` as markdown links. + +Note: It is still harmless to include `helpUri` in the SARIF output (it's valid SARIF), but do not rely on it for display. + +--- + +## 14. 
References + +- [GitHub SARIF Support Documentation](https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning) +- [SARIF v2.1.0 Specification (OASIS)](https://docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html) +- [SARIF JSON Schema](https://json.schemastore.org/sarif-2.1.0.json) +- [GitHub CodeQL Action Fingerprints](https://github.com/github/codeql-action/blob/main/src/fingerprints.ts) +- [Microsoft SARIF Validator](https://sarifweb.azurewebsites.net/) + +--- + +## 15. Clarifying Questions + +None — all research questions have been answered through the GitHub documentation and SARIF specification. diff --git a/.copilot-tracking/research/subagents/2026-03-12/sarif-generator-analysis.md b/.copilot-tracking/research/subagents/2026-03-12/sarif-generator-analysis.md new file mode 100644 index 0000000..24aa46c --- /dev/null +++ b/.copilot-tracking/research/subagents/2026-03-12/sarif-generator-analysis.md @@ -0,0 +1,436 @@ +# SARIF Generator Gap Analysis for GitHub Code Scanning + +## Research Topics + +1. Current SARIF fields produced vs. SARIF v2.1.0 spec requirements for GitHub Code Scanning +2. IBM Equal Access URL pattern mismatch and encoding issues +3. Data available in AxeViolation/AxeNode that should flow into SARIF +4. Missing SARIF properties that GitHub Code Scanning needs for rich display + +## Status: Complete + +--- + +## 1. Current SARIF Output — Field Inventory + +### Source File + +`src/lib/report/sarif-generator.ts` — 142 lines total. 
+ +### Current SarifRule (reportingDescriptor) Fields + +| Field | Present | Value Source | +|---|---|---| +| `id` | Yes | `violation.id` | +| `name` | Yes | `violation.id` (same as id — not a human-readable name) | +| `shortDescription.text` | Yes | `violation.description` | +| `helpUri` | Yes | `violation.helpUrl` (passed through from normalizer) | +| `properties.tags` | Yes | `violation.tags` | +| `fullDescription.text` | **MISSING** | GitHub marks this **Required** | +| `help.text` | **MISSING** | GitHub marks this **Required** | +| `help.markdown` | **MISSING** | GitHub **Recommended** — when present, displayed instead of `help.text` | +| `defaultConfiguration.level` | **MISSING** | GitHub **Optional** but used for severity display | +| `properties.precision` | **MISSING** | GitHub **Recommended** — affects result ordering | +| `properties.problem.severity` | **MISSING** | GitHub **Recommended** — affects result ordering | + +### Current SarifResult Fields + +| Field | Present | Value Source | +|---|---|---| +| `ruleId` | Yes | `violation.id` | +| `ruleIndex` | Yes | Index into rules array | +| `level` | Yes | Mapped from `violation.impact` via `mapImpactToLevel()` | +| `message.text` | Yes | `"{help} ({url} — {target})"` — plain text only | +| `locations[0].physicalLocation.artifactLocation.uri` | Yes | `urlToArtifactPath(url)` — converts URL to hostname/path | +| `locations[0].physicalLocation.region.snippet.text` | Yes | `node.html` | +| `partialFingerprints.primaryLocationLineHash` | Yes | Simple hash of `violation.id:target` | +| `relatedLocations` | **MISSING** | Could link multiple affected nodes | +| `codeFlows` | **MISSING** | Not applicable for accessibility | +| `locations[0].physicalLocation.region.startLine` | **MISSING** | Not available from DOM scanning | +| `locations[0].physicalLocation.region.startColumn` | **MISSING** | Not available from DOM scanning | +| `locations[0].message.text` | **MISSING** | Could carry failure summary | + +### 
Current SarifRun Fields + +| Field | Present | Value Source | +|---|---|---| +| `tool.driver.name` | Yes | `'accessibility-scanner'` | +| `tool.driver.version` | Yes | Passed as parameter | +| `tool.driver.rules` | Yes | Built from violations | +| `results` | Yes | Built from violation nodes | +| `tool.driver.semanticVersion` | **MISSING** | GitHub prefers this over `version` | +| `tool.driver.informationUri` | **MISSING** | Link to tool documentation | +| `automationDetails.id` | **MISSING** | Enables category-based filtering | +| `columnKind` | **MISSING** | Should be `"utf16CodeUnits"` per spec examples | + +### Top-Level SarifLog Fields + +| Field | Present | Notes | +|---|---|---| +| `$schema` | Yes | Uses OASIS spec URL | +| `version` | Yes | `'2.1.0'` | +| `runs` | Yes | One per URL scanned | + +--- + +## 2. IBM Equal Access URL Issue — Root Cause Analysis + +### Two Different URL Patterns + +**Pattern A — IBM raw result `help` field (from actual scan data):** + +```text +https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#... +``` + +This is the *actual* URL embedded in the IBM Equal Access engine results. It points to a versioned archive path and includes rule-specific URL fragment with encoded JSON context. These URLs work correctly. + +**Pattern B — result-normalizer.ts line 75 (hardcoded in normalizer):** + +```text +https://able.ibm.com/rules/tools/help/${r.ruleId} +``` + +This is a *different, generic* URL pattern that the normalizer substitutes. It uses `/rules/tools/help/` instead of the versioned archive path. 
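To make the difference between the two patterns concrete, here is a minimal sketch of a helper that reuses the engine-provided archive URL (Pattern A) without its fragment, instead of constructing the generic Pattern B URL. The function names, the `version` parameter, and the `latest` fallback segment are assumptions for illustration only; they are not taken from the existing normalizer, and whether a `latest` archive path resolves should be verified before relying on it.

```typescript
// Hypothetical helpers for illustration; not the normalizer's actual code.
const IBM_ARCHIVE_BASE = "https://able.ibm.com/rules/archives";

/** Pattern B: the generic URL the normalizer currently constructs. */
function genericHelpUrl(ruleId: string): string {
  return `https://able.ibm.com/rules/tools/help/${ruleId}`;
}

/**
 * Pattern A, cleaned up: keep the archive URL reported by the engine but drop
 * everything from "#" onward (the encoded JSON context). When the raw value is
 * missing or is not an http(s) URL, build an archive-style URL from the rule ID.
 * The default version segment "latest" is an assumption.
 */
function archiveHelpUrl(
  rawHelp: string | undefined,
  ruleId: string,
  version: string = "latest",
): string {
  const fallback = `${IBM_ARCHIVE_BASE}/${version}/doc/en-US/${ruleId}.html`;
  if (!rawHelp || !/^https?:\/\//i.test(rawHelp)) {
    return fallback; // rawHelp is absent or plain text, not a URL
  }
  // split("#")[0] is the whole string when there is no fragment
  return rawHelp.split("#")[0];
}

const raw =
  "https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#%7B%22a%22%3A1%7D";
console.log(genericHelpUrl("style_color_misuse"));
console.log(archiveHelpUrl(raw, "style_color_misuse"));
console.log(archiveHelpUrl("Verify color is not used alone", "style_color_misuse"));
```

Plain string handling is used here rather than a URL parser so the sketch has no runtime or typing dependencies; a production version might prefer stricter URL validation.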
+ +### The Core Problem + +The `normalizeIbmResults()` function at `src/lib/scanner/result-normalizer.ts:75` **discards the IBM raw `help` URL** and replaces it with a constructed URL: + +```typescript +helpUrl: `https://able.ibm.com/rules/tools/help/${r.ruleId}`, +``` + +However, the IBM raw result object has a working `help` field containing the full archive URL: + +```json +"help": "https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#..." +``` + +The normalizer maps `r.help ?? r.message` to the `help` text field, treating the IBM `help` property as a text description rather than recognizing it as a URL. + +### The Underscore Encoding Issue + +The user reports GitHub Code Scanning showing broken links like: + +```text +Cannot GET /rules/archives/2026.03.04/doc/en-US/label\_name\_visible.html +``` + +Note the `\_` (backslash-escaped underscores). This happens because: + +1. The SARIF `helpUri` contains a URL with underscores in the rule ID (e.g., `label_name_visible`) +2. Somewhere in the rendering pipeline (likely GitHub's markdown processing of SARIF content), underscores are being treated as markdown emphasis delimiters and backslash-escaped +3. The escaped underscores (`\_`) are passed through to the HTTP request, resulting in a 404 + +### Root Cause Summary + +There are actually **two bugs**: + +1. **Wrong URL pattern**: The normalizer substitutes `/rules/tools/help/{ruleId}` instead of using the IBM-provided archive URL (`/rules/archives/{version}/doc/en-US/{ruleId}.html`) +2. 
**Potential markdown escaping**: If the URL passes through any markdown processing step, underscores in rule IDs like `label_name_visible` get backslash-escaped to `label\_name\_visible` + +### Fix Approach + +- Use the IBM raw `help` URL directly (strip the `#fragment` portion containing encoded JSON context) +- Or construct URLs using the archive pattern: `https://able.ibm.com/rules/archives/{version}/doc/en-US/{ruleId}.html` +- Ensure `helpUri` URLs are never processed as markdown + +--- + +## 3. Data Available in AxeViolation/AxeNode Not Flowing to SARIF + +### AxeViolation Fields (from `src/lib/types/scan.ts`) + +| Field | Type | Used in SARIF | Notes | +|---|---|---|---| +| `id` | `string` | Yes — `ruleId`, `rule.id`, `rule.name` | `name` should be human-readable, not same as `id` | +| `impact` | `'minor'\|'moderate'\|'serious'\|'critical'` | Yes — mapped to `level` | Also usable for `defaultConfiguration.level` | +| `tags` | `string[]` | Yes — `properties.tags` | Could also drive `properties.precision` | +| `description` | `string` | Yes — `shortDescription.text` | Should also populate `fullDescription.text` | +| `help` | `string` | Partial — used in `message.text` | **Should populate `help.text`** | +| `helpUrl` | `string` | Yes — `helpUri` | Broken for IBM (see section 2) | +| `nodes` | `AxeNode[]` | Partial | Only first node's html used as snippet | +| `principle` | `string?` | **No** | Could be added to `properties.tags` or `properties` bag | +| `engine` | `string?` | **No** | Could be added to `properties` bag | + +### AxeNode Fields (from `src/lib/types/scan.ts`) + +| Field | Type | Used in SARIF | Notes | +|---|---|---|---| +| `html` | `string` | Yes — `region.snippet.text` | Good | +| `target` | `string[]` | Partial — in `message.text` | Could be in `location.message.text` | +| `impact` | `string` | **No** | Node-level impact ignored | +| `failureSummary` | `string?` | **No** | **Rich data lost** — should appear in help or message | + +### Data 
Available in HTML Report But Missing from SARIF + +The `ViolationList.tsx` component renders all this data for each violation: + +1. **Impact badge** — severity level (critical/serious/moderate/minor) — ✅ in SARIF as `level` +2. **Help text** (violation.help) — the concise rule summary — ❌ NOT in SARIF `help.text` +3. **Rule ID** (violation.id) — ✅ in SARIF +4. **Affected elements count** (violation.nodes.length) — ❌ NOT in SARIF +5. **Description** (violation.description) — ✅ in `shortDescription` only +6. **HTML snippet** per node (node.html) — ✅ in `region.snippet.text` +7. **Failure summary** per node (node.failureSummary) — ❌ NOT in SARIF +8. **CSS selector** per node (node.target) — partial in `message.text` only +9. **Learn more link** (violation.helpUrl) — ✅ in `helpUri` (but broken for IBM) +10. **Principle grouping** (violation.principle) — ❌ NOT in SARIF properties + +--- + +## 4. Missing SARIF Properties for GitHub Code Scanning Rich Display + +### Critical Missing Properties (Required by GitHub) + +#### `fullDescription.text` — Required + +GitHub displays this alongside results. Currently absent; `shortDescription.text` is set to `violation.description` but `fullDescription` is not set at all. + +**Fix**: Set `fullDescription.text` to `violation.description` (same as shortDescription, or expand with WCAG criteria). + +#### `help.text` — Required + +GitHub displays this as "Rule help" documentation. Without it, GitHub shows "no rule help available for this alert." + +**Fix**: Set `help.text` to `violation.help` — the concise rule guidance. + +#### `help.markdown` — Recommended (displayed instead of `help.text` when present) + +This is the key property for rich rule documentation in GitHub Code Scanning. When present, GitHub renders it as formatted markdown alongside alerts. 
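As a rough illustration of how such a `help.markdown` string could be assembled from violation fields, the sketch below mirrors the template that follows. The function name `buildHelpMarkdownSketch` and the `ViolationLike` shape are hypothetical stand-ins (the real `AxeViolation` type lives in `src/lib/types/scan.ts`), and selecting WCAG tags by a `wcag` prefix is an assumption about how the tags are named.

```typescript
// Minimal stand-in for the fields used here; not the project's actual type.
interface ViolationLike {
  help: string;
  description: string;
  impact: "minor" | "moderate" | "serious" | "critical";
  tags: string[];
  helpUrl: string;
  principle?: string;
}

/** Builds a Markdown help string; optional sections are skipped when empty. */
function buildHelpMarkdownSketch(v: ViolationLike): string {
  // Assumption: WCAG-related tags start with "wcag" (case-insensitive).
  const wcagTags = v.tags.filter((t) => t.toLowerCase().startsWith("wcag"));

  const lines: string[] = [
    `## ${v.help}`,
    "",
    v.description,
    "",
    `**Impact**: ${v.impact}`,
  ];
  if (wcagTags.length > 0) {
    lines.push(`**WCAG Criteria**: ${wcagTags.join(", ")}`);
  }
  if (v.principle) {
    lines.push(`**Principle**: ${v.principle}`);
  }
  // The help URL is embedded as a Markdown link so it does not depend on helpUri.
  lines.push("", `[Learn more](${v.helpUrl})`);
  return lines.join("\n");
}

const example: ViolationLike = {
  help: "Elements must meet minimum color contrast ratio thresholds",
  description: "Ensures the contrast between foreground and background colors meets WCAG 2 AA thresholds",
  impact: "serious",
  tags: ["cat.color", "wcag2aa", "wcag143"],
  helpUrl: "https://dequeuniversity.com/rules/axe/4.10/color-contrast",
  principle: "Perceivable",
};
console.log(buildHelpMarkdownSketch(example));
```

Skipping the WCAG and principle lines when the data is absent avoids rendering empty labels for rules that carry no such metadata.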
+ +**Recommended content for `help.markdown`**: + +```markdown +## {violation.help} + +{violation.description} + +**Impact**: {violation.impact} +**WCAG Criteria**: {wcag tags joined} +**Principle**: {violation.principle} + +[Learn more]({violation.helpUrl}) +``` + +### Important Missing Properties (Recommended by GitHub) + +#### `defaultConfiguration.level` + +Maps impact to SARIF level. Used by GitHub to establish default severity when `result.level` is not set. + +```typescript +defaultConfiguration: { + level: mapImpactToLevel(violation.impact) // 'error' | 'warning' | 'note' +} +``` + +#### `properties.precision` + +GitHub uses this with severity to order results. Accessibility scanner results are generally high confidence. + +```typescript +properties: { + precision: 'high', // axe-core rules are well-tested + // or 'medium' for IBM potentialviolation +} +``` + +#### `properties.problem.severity` + +Non-security result severity. Maps to impact: + +```typescript +properties: { + 'problem.severity': violation.impact === 'critical' || violation.impact === 'serious' + ? 'error' + : violation.impact === 'moderate' ? 'warning' : 'recommendation' +} +``` + +### Nice-to-Have Missing Properties + +#### `tool.driver.semanticVersion` + +GitHub prefers this over `version` for tracking tool version changes. + +#### `tool.driver.informationUri` + +Link to the tool's documentation/homepage. + +#### `automationDetails.id` + +Enables category-based analysis runs. Example: `"accessibility-scan/wcag2aa/"`. + +#### `relatedLocations` + +When a violation affects multiple nodes, only the first becomes the primary `location`. Additional nodes could be listed as `relatedLocations` with their own snippets and selectors. + +--- + +## 5. 
Existing SARIF Examples in Codebase + +### Prior Research Example (from `.copilot-tracking/research/subagents/2026-03-06/cicd-integration-research.md`) + +Contains an ideal SARIF rule structure with all GitHub-supported fields: + +```json +{ + "id": "color-contrast", + "name": "color-contrast", + "shortDescription": { "text": "Elements must meet minimum color contrast ratio thresholds" }, + "fullDescription": { "text": "Ensures the contrast between foreground and background colors meets WCAG 2 AA minimum contrast ratio thresholds" }, + "helpUri": "https://dequeuniversity.com/rules/axe/4.11/color-contrast", + "help": { + "text": "Elements must meet minimum color contrast ratio thresholds", + "markdown": "Elements must meet minimum color contrast ratio thresholds. [More info](https://dequeuniversity.com/rules/axe/4.11/color-contrast)" + }, + "defaultConfiguration": { "level": "error" }, + "properties": { + "tags": ["wcag2aa", "wcag143", "accessibility"], + "precision": "high", + "problem.severity": "error" + } +} +``` + +### Test Fixture Example (from `sarif-generator.test.ts`) + +Test violations use axe-core help URLs which work fine: + +```typescript +helpUrl: 'https://dequeuniversity.com/rules/axe/4.0/color-contrast' +``` + +### Raw IBM Result Example (from `results/https_/example.com.json`) + +IBM results contain full archive URLs in the `help` field: + +```json +{ + "ruleId": "style_color_misuse", + "help": "https://able.ibm.com/rules/archives/2026.03.04/doc/en-US/style_color_misuse.html#...", + "message": "Verify color is not used as the only visual means of conveying information", + "snippet": "