Add partial-run support and fix implicit skip handling in tier audit #236

Open
jeffhandley wants to merge 3 commits into modelcontextprotocol:main from jeffhandley:jeffhandley/tier-audit

Conversation

@jeffhandley
Contributor

Two improvements to the tier-check CLI, split into separate commits for reviewability.

Motivation and Context

When the tier audit skill was invoked with only server or only client conformance results (or neither), the omitted suite's checks were reported as failures rather than being recognized as intentionally absent. This produced misleading tier classifications and noisy reports. Additionally, there was no way to scope an audit to just conformance tests or just repository health checks, so every run paid the full cost even when only part of the audit was needed.

How Has This Been Tested?

  • Added unit tests for resolveTierCheckPlan() covering implicit skip detection, nothing-to-run validation, and explicit scope exclusions (3 new tests)
  • Full test suite passes on each commit independently (93/93 on commit 1, 95/95 on commit 2)
  • Lint (eslint + prettier) passes on each commit independently

Breaking Changes

None. All new CLI flags are optional and the default behavior (full audit) is unchanged.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Commit 1 — Fix misleading tier results when conformance inputs are omitted

  • Extracts resolveTierCheckPlan() to detect when conformance inputs are absent and mark those suites as skipped rather than failed
  • Adds partial_run field to TierScorecard (derived from actual check statuses)
  • Updates output formatters: skipped checks display as ○ (empty circle), conformance tables and totals gracefully handle absent suites, and a partial-run banner is shown when applicable
  • Includes a pre-existing prettier fix in authorization-server-metadata.ts
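
For reviewers who want a mental model of the implicit-skip behavior in commit 1, here is a minimal sketch. Only the name resolveTierCheckPlan() comes from this PR. The TierCheckInputs and TierCheckPlan shapes, their field names, and the reason strings are assumptions made for illustration and may not match the actual code.

```typescript
// Hypothetical input shape: the two conformance inputs the CLI may receive.
interface TierCheckInputs {
  conformanceServerUrl?: string; // assumed to correspond to --conformance-server-url
  clientCmd?: string; // assumed to correspond to --client-cmd
}

// Hypothetical plan shape: which suites run, and why any were skipped.
interface TierCheckPlan {
  runServerConformance: boolean;
  runClientConformance: boolean;
  skipReasons: Record<string, string>;
}

// A suite whose input is absent becomes an implicit skip instead of a failure.
function resolveTierCheckPlan(inputs: TierCheckInputs): TierCheckPlan {
  const skipReasons: Record<string, string> = {};
  const runServerConformance = Boolean(inputs.conformanceServerUrl);
  const runClientConformance = Boolean(inputs.clientCmd);
  if (!runServerConformance) {
    skipReasons["conformance"] = "no server URL provided";
  }
  if (!runClientConformance) {
    skipReasons["client_conformance"] = "no client command provided";
  }
  return { runServerConformance, runClientConformance, skipReasons };
}

// Only the server input is supplied, so the client suite is skipped.
const serverOnlyPlan = resolveTierCheckPlan({
  conformanceServerUrl: "http://localhost:3000/mcp",
});
console.log(serverOnlyPlan);
```

The point of returning reasons alongside the booleans is that the formatters can then render a skipped row rather than a failed one.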

Commit 2 — Add partial-run support with --skip-* flags and --scope presets

  • Adds CLI flags: --skip-server-conformance, --skip-client-conformance, --skip-conformance, --skip-repo-health
  • Adds --scope presets in SKILL.md (server, client, conformance, repo-health, full) that the AI skill layer expands to --skip-* flags
  • GitHub token is only required when repo-health checks are included
  • Exits with a clear error if all checks are skipped
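
The interaction between scope presets, skip flags, the token requirement, and the nothing-to-run guard can be sketched as follows. The flag and preset names are the ones listed above. The exact preset-to-flag mapping, the decideRun() helper, and the error text are assumptions for illustration, and the skill-layer --skip-docs-eval flag is left out.

```typescript
type Scope = "server" | "client" | "conformance" | "repo-health" | "full";

// Assumed expansion of each preset into CLI skip flags.
const SCOPE_TO_SKIP_FLAGS: Record<Scope, string[]> = {
  server: ["--skip-client-conformance", "--skip-repo-health"],
  client: ["--skip-server-conformance", "--skip-repo-health"],
  conformance: ["--skip-repo-health"],
  "repo-health": ["--skip-conformance"],
  full: [],
};

// Hypothetical summary of what a run will do.
interface RunDecision {
  runServer: boolean;
  runClient: boolean;
  runRepoHealth: boolean;
  tokenRequired: boolean;
}

function decideRun(flags: string[]): RunDecision {
  const has = (flag: string): boolean => flags.indexOf(flag) !== -1;
  const skipBoth = has("--skip-conformance");
  const runServer = !skipBoth && !has("--skip-server-conformance");
  const runClient = !skipBoth && !has("--skip-client-conformance");
  const runRepoHealth = !has("--skip-repo-health");
  // Nothing-to-run guard: fail fast when every category is excluded.
  if (!runServer && !runClient && !runRepoHealth) {
    throw new Error("Nothing to run: every check category was skipped");
  }
  // A GitHub token only matters when repo-health checks will call the API.
  return { runServer, runClient, runRepoHealth, tokenRequired: runRepoHealth };
}

console.log(decideRun(SCOPE_TO_SKIP_FLAGS["repo-health"]));
```

Keeping the presets as a thin expansion over the skip flags means the CLI itself only has to understand one vocabulary.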

Sample output

"/mcp-sdk-tier-audit for modelcontextprotocol/csharp-sdk with a scope of repo-health only"

MCP SDK Tier Audit: modelcontextprotocol/csharp-sdk

Date: 2026-04-17
Branch: main
Scope: partial run — repo-health only (conformance and docs eval skipped via --scope repo-health)
Auditor: mcp-sdk-tier-audit skill (automated + subagent evaluation)

Tier Assessment: N/A (partial run)

This is a partial assessment covering repository health checks only. Server/client conformance tests and documentation coverage evaluation were excluded by scope. No definitive tier classification is assigned.

Requirements Summary

| # | Requirement | Tier 1 Standard | Tier 2 Standard | Current Value | T1? | T2? | Gap |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1a | Server Conformance | 100% pass rate | >= 80% pass rate | ○ skipped (excluded by scope) | N/A | N/A | |
| 1b | Client Conformance | 100% pass rate | >= 80% pass rate | ○ skipped (excluded by scope) | N/A | N/A | |
| 2 | Issue Triage | >= 90% within 2 biz days | >= 80% within 1 month | 97.4% (111/114 within SLA; median 58.4 h, p95 61.7 h) | PASS | PASS | |
| 2b | Labels | 12 required labels | 12 required labels | 12/12 present | PASS | PASS | |
| 3 | Critical Bug Resolution | All P0s within 7 days | All P0s within 2 weeks | 0 open P0s; all resolved within 7 d | PASS | PASS | |
| 4 | Stable Release | Required + clear versioning | At least one stable release | v1.2.0 (stable, non-prerelease) | PASS | PASS | |
| 4b | Spec Tracking | Timeline agreed per release | Within 6 months | SDK release within 0 days of spec release (2025-11-25) | PASS | PASS | |
| 5 | Documentation | Comprehensive w/ examples | Basic docs for core features | ○ skipped (excluded by scope) | N/A | N/A | |
| 6 | Dependency Policy | Published update policy | Published update policy | Dependabot configured (weekly NuGet + GH Actions, grouped, PR limits) | PASS | PASS | |
| 7 | Roadmap | Published roadmap | Plan toward Tier 1 | docs/roadmap.md with spec-revision boards, milestones, focus areas | PASS | PASS | |
| 8 | Versioning Policy | Documented breaking change policy | N/A | docs/versioning.md with SemVer 2.0.0, breaking-change labels, support policy | PASS | N/A | |

Tier Determination

No tier assigned — partial run (repo-health only). A full audit is required for tier classification.

All repo-health checks that ran are passing at Tier 1 level. Conformance testing (1a, 1b) and documentation evaluation (5) remain unevaluated and are required before a tier can be assigned.


Server Conformance

○ Not run — excluded by scope (--scope repo-health)

Client Conformance

○ Not run — excluded by scope (--scope repo-health)

Issue Triage Details

| Metric | Value |
| --- | --- |
| Compliance rate | 97.4% |
| Total issues | 114 |
| Triaged within SLA | 111 |
| Exceeding SLA | 2 |
| Median triage time | 58.4 h |
| p95 triage time | 61.7 h |

All 12 required labels are present: bug, enhancement, question, needs confirmation, needs repro, ready for work, good first issue, help wanted, P0, P1, P2, P3. Issue types are not used (label-based triage).

The triage rate of 97.4% exceeds both the Tier 1 threshold (≥ 90%) and the Tier 2 threshold (≥ 80%). Only 2 issues exceeded the SLA window. Median and p95 triage times (58.4 h and 61.7 h respectively) are well within the 2-business-day Tier 1 standard.
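
The 97.4% figure is simply 111/114 rounded to one decimal place. The sketch below reproduces that arithmetic and the two threshold comparisons quoted in this report. The summarizeTriage() helper and its return shape are hypothetical and are not part of the CLI.

```typescript
// Thresholds as quoted in this report: Tier 1 >= 90%, Tier 2 >= 80%.
const TIER1_TRIAGE_THRESHOLD = 0.9;
const TIER2_TRIAGE_THRESHOLD = 0.8;

interface TriageSummary {
  rate: number;
  percent: string;
  meetsTier1: boolean;
  meetsTier2: boolean;
}

function summarizeTriage(triagedWithinSla: number, totalIssues: number): TriageSummary {
  if (totalIssues <= 0) {
    // Assumption: an empty issue set is rejected rather than scored.
    throw new Error("totalIssues must be positive");
  }
  const rate = triagedWithinSla / totalIssues;
  return {
    rate,
    percent: `${(rate * 100).toFixed(1)}%`,
    meetsTier1: rate >= TIER1_TRIAGE_THRESHOLD,
    meetsTier2: rate >= TIER2_TRIAGE_THRESHOLD,
  };
}

const triageSummary = summarizeTriage(111, 114);
console.log(triageSummary.percent); // "97.4%"
```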

Critical Bug Resolution

No open P0 issues. No P0s were closed during the evaluation window. The all_p0s_resolved_within_7d and all_p0s_resolved_within_14d flags both report true, satisfying both Tier 1 (7-day) and Tier 2 (14-day) standards.
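
Both flags can be true with zero closed P0s because an "every" check over an empty list is vacuously true. The sketch below shows one way that could be computed. The ClosedP0 shape and the allResolvedWithin() helper are hypothetical, and the real check may treat open P0s differently (for example, by allowing recently opened ones).

```typescript
// Hypothetical record for a closed P0 issue.
interface ClosedP0 {
  issueNumber: number;
  daysToClose: number;
}

// Assumed rule: no open P0s, and every closed P0 was resolved within the window.
function allResolvedWithin(openP0s: number, closed: ClosedP0[], days: number): boolean {
  return openP0s === 0 && closed.every((issue) => issue.daysToClose <= days);
}

// No open P0s and none closed in the window: both windows are satisfied.
console.log(allResolvedWithin(0, [], 7), allResolvedWithin(0, [], 14));
```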

Stable Release & Spec Tracking

  • Latest stable release: v1.2.0 (non-prerelease)
  • Latest spec release: 2025-11-25
  • Latest SDK release: 2025-11-25
  • Gap: 0 days — SDK released same day as the spec release

Meets Tier 1 requirements for both stable release availability and spec tracking responsiveness.
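
The 0-day gap can be checked against the two timestamps in the raw CLI data (2025-11-25T21:17:42Z for the spec, 2025-11-25T23:44:39Z for the SDK). This is a sketch that assumes the gap is counted in whole elapsed days and clamped at zero. The CLI may compute days_gap differently, and daysGap() is a hypothetical helper.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between the spec release and the SDK release that follows it.
// Assumption: an SDK release that predates the spec counts as a zero gap.
function daysGap(specReleaseIso: string, sdkReleaseIso: string): number {
  const diffMs = Date.parse(sdkReleaseIso) - Date.parse(specReleaseIso);
  return Math.floor(Math.max(diffMs, 0) / MS_PER_DAY);
}

const gapDays = daysGap("2025-11-25T21:17:42Z", "2025-11-25T23:44:39Z");
console.log(gapDays, gapDays <= 30); // 0 true
```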

Policy Evaluation

Dependency Update Policy: PASS

Dependabot is configured with weekly NuGet and GitHub Actions updates, grouped dependencies, ignore rules for product dependencies, and PR limits. A DEPENDENCY_POLICY.md or docs/dependency-policy.md file is not present, but the active Dependabot configuration (.github/dependabot.yml) demonstrates a published and enforced update policy.

Roadmap: PASS (Tier 1 and Tier 2)

docs/roadmap.md is present. It references org-level spec revision project boards, documents current focus areas (next spec revision, Tasks experimental support, end-to-end scenarios), and links to GitHub milestones for planned versions.

Versioning Policy: PASS (Tier 1)

docs/versioning.md comprehensively documents SemVer 2.0.0 adherence, defines what constitutes breaking changes (incompatible API changes, spec schema changes), explains how breaking changes are communicated (release notes, breaking-change labels, [Obsolete] attributes with MCP-prefixed diagnostics), and details the support policy for MAJOR/MINOR/PATCH versions.

Missing Policy Files

The following policy files were not found but are not blocking for tier compliance:

  • CHANGELOG.md — not present (release notes may be published via GitHub Releases instead)
  • BREAKING_CHANGES.md — not present (breaking changes are documented in docs/versioning.md and communicated via labels)

Documentation Coverage

○ Not run — excluded by scope (--scope repo-health)


Raw CLI Data

CLI JSON output
{
  "repo": "modelcontextprotocol/csharp-sdk",
  "branch": "main",
  "timestamp": "2026-04-17T06:56:51.484Z",
  "version": "1.2.0",
  "partial_run": true,
  "checks": {
    "conformance": { "status": "skipped" },
    "client_conformance": { "status": "skipped" },
    "labels": {
      "status": "pass",
      "present": 12,
      "required": 12,
      "missing": [],
      "found": ["bug","enhancement","question","needs confirmation","needs repro","ready for work","good first issue","help wanted","P0","P1","P2","P3"],
      "uses_issue_types": false
    },
    "triage": {
      "status": "pass",
      "compliance_rate": 0.9736842105263158,
      "total_issues": 114,
      "triaged_within_sla": 111,
      "exceeding_sla": 2,
      "median_hours": 58.4,
      "p95_hours": 61.7
    },
    "p0_resolution": {
      "status": "pass",
      "open_p0s": 0,
      "open_p0_details": [],
      "closed_within_7d": 0,
      "closed_within_14d": 0,
      "closed_total": 0,
      "all_p0s_resolved_within_7d": true,
      "all_p0s_resolved_within_14d": true
    },
    "stable_release": {
      "status": "pass",
      "version": "1.2.0",
      "is_stable": true,
      "is_prerelease": false
    },
    "policy_signals": {
      "status": "partial",
      "files": {
        "CHANGELOG.md": false,
        "SECURITY.md": true,
        "CONTRIBUTING.md": true,
        "DEPENDENCY_POLICY.md": false,
        "docs/dependency-policy.md": false,
        ".github/dependabot.yml": true,
        ".github/renovate.json": false,
        "renovate.json": false,
        "ROADMAP.md": false,
        "docs/roadmap.md": true,
        "VERSIONING.md": false,
        "docs/versioning.md": true,
        "BREAKING_CHANGES.md": false
      }
    },
    "spec_tracking": {
      "status": "pass",
      "latest_spec_release": "2025-11-25T21:17:42Z",
      "latest_sdk_release": "2025-11-25T23:44:39Z",
      "sdk_release_within_30d": true,
      "days_gap": 0
    }
  }
}
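
The partial_run value in the JSON above is described in this PR as derived from the check statuses. A minimal sketch of that derivation follows. The Scorecard and CheckResult shapes are trimmed-down assumptions, the "fail" status is assumed since it does not appear in this sample, and tierAssessment() is a hypothetical helper.

```typescript
// "pass", "partial" and "skipped" appear in the sample JSON; "fail" is assumed.
type CheckStatus = "pass" | "fail" | "partial" | "skipped";

interface CheckResult {
  status: CheckStatus;
}

// Simplified stand-in for the real TierScorecard type.
interface Scorecard {
  checks: Record<string, CheckResult>;
}

// Any skipped check makes the whole run partial.
function isPartialRun(card: Scorecard): boolean {
  for (const key of Object.keys(card.checks)) {
    const check = card.checks[key];
    if (check !== undefined && check.status === "skipped") {
      return true;
    }
  }
  return false;
}

// A partial run never reports a definitive tier.
function tierAssessment(card: Scorecard, computedTier: string): string {
  return isPartialRun(card) ? "N/A (partial run)" : computedTier;
}

const sampleCard: Scorecard = {
  checks: {
    conformance: { status: "skipped" },
    client_conformance: { status: "skipped" },
    labels: { status: "pass" },
    triage: { status: "pass" },
  },
};
console.log(isPartialRun(sampleCard), tierAssessment(sampleCard, "Tier 1"));
```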

Remediation Guide: modelcontextprotocol/csharp-sdk

Date: 2026-04-17
Scope: partial run — repo-health only (conformance and docs eval skipped via --scope repo-health)
Current Tier: N/A (partial run)

Findings from scoped run

All repo-health checks passed. No remediation items were identified in the evaluated scope.

| # | Finding | Status | Detail |
| --- | --- | --- | --- |
| 1 | Labels | PASS | 12/12 required labels present |
| 2 | Issue Triage | PASS | 97.4% triaged (111/114); exceeds Tier 1 threshold (≥90%) and Tier 2 threshold (≥80%) |
| 3 | P0 Resolution | PASS | 0 open P0 issues |
| 4 | Stable Release | PASS | v1.2.0 published |
| 5 | Spec Tracking | PASS | 0-day gap between spec releases and SDK tracking updates |
| 6 | Dependency Policy | PASS | Dependabot configured with weekly NuGet and GitHub Actions updates |
| 7 | Roadmap | PASS | docs/roadmap.md with concrete focus areas and milestone links |
| 8 | Versioning Policy | PASS | docs/versioning.md with comprehensive SemVer policy, breaking change definitions, and communication strategy (meets Tier 1) |

Not evaluated (excluded by scope)

The following areas were not assessed in this partial run and require a full audit (--scope full) to evaluate:

  • Server conformance test pass rate
  • Client conformance test pass rate
  • Documentation coverage (feature-by-feature)

A full audit is required to enumerate all tier gaps and assign a definitive tier classification.

Recommended Next Steps

  1. Run a full audit (--scope full) to obtain server/client conformance pass rates, documentation coverage scores, and a definitive tier classification.
  2. Repo-health posture is strong — prioritize conformance test coverage as the likely differentiator for tier placement.
  3. Review any newly added spec features since the last conformance run to ensure test scenarios exist for them.

"/mcp-sdk-tier-audit for modelcontextprotocol/csharp-sdk. Use the main branch. Skip repo health and docs."

MCP SDK Tier Audit: modelcontextprotocol/csharp-sdk

Date: 2026-04-17
Branch: main
Scope: partial run — conformance only (repo health and docs eval skipped via --skip-repo-health --skip-docs-eval)
Auditor: mcp-sdk-tier-audit skill (automated + subagent evaluation)

Tier Assessment: N/A (partial run)

This is a partial assessment covering server and client conformance tests only. Both server and client achieve a 100% pass rate on all date-versioned scenarios. Three extension-only client auth failures were observed (not scored for tier). No tier is assigned — repo health and documentation evaluation were not performed.


Requirements Summary

| # | Requirement | Tier 1 Standard | Tier 2 Standard | Current Value | T1? | T2? | Gap |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1a | Server Conformance | 100% pass rate | ≥ 80% pass rate | 100% (30/30) | PASS | PASS | None |
| 1b | Client Conformance | 100% pass rate | ≥ 80% pass rate | 100% (23/23 date-versioned) | PASS | PASS | None (3 extension failures not scored) |
| 2 | Issue Triage | All P0/P1 triaged within 2 business days | All P0/P1 triaged within 5 business days | ○ skipped (excluded by scope) | N/A | N/A | |
| 2b | Labels | bug/enhancement/question labels in use | At least bug label in use | ○ skipped (excluded by scope) | N/A | N/A | |
| 3 | Critical Bug Resolution | P0 resolved within 7 days | P0 resolved within 14 days | ○ skipped (excluded by scope) | N/A | N/A | |
| 4 | Stable Release | Stable (non-preview) release published | Pre-release or stable release | ○ skipped (excluded by scope) | N/A | N/A | |
| 4b | Spec Tracking | Tracks latest spec within 30 days | Tracks latest spec within 90 days | ○ skipped (excluded by scope) | N/A | N/A | |
| 5 | Documentation | API reference + getting-started guide | README with basic usage | ○ skipped (excluded by scope) | N/A | N/A | |
| 6 | Dependency Policy | Documented dependency/update policy | Dependencies reasonably up to date | ○ skipped (excluded by scope) | N/A | N/A | |
| 7 | Roadmap | Public roadmap or project board | Milestones or tagged issues | ○ skipped (excluded by scope) | N/A | N/A | |
| 8 | Versioning Policy | SemVer with documented policy | SemVer or consistent scheme | ○ skipped (excluded by scope) | N/A | N/A | |

Tier Determination

No tier assigned — partial run (conformance only). A full audit is required for tier classification.

Conformance Matrix

| | 2025-03-26 | 2025-06-18 | 2025-11-25 | All date-versioned |
| --- | --- | --- | --- | --- |
| Server | | 22/22 | 30/30 | 30/30 (100%) |
| Client: Core | | 2/2 | 4/4 | 4/4 (100%) |
| Client: Auth | 2/2 | 3/3 | 14/14 | 19/19 (100%) |
| Client Total | 2/2 | 5/5 | 18/18 | 23/23 (100%) |

Informational (not scored for tier):

| | draft | extension |
| --- | --- | --- |
| Client: Auth | 3/3 | 0/3 |

No baseline file found.
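
The matrix above separates date-versioned scenarios, which count toward the tier, from draft and extension scenarios, which are informational. One way to express that split is sketched below. The ScenarioResult shape, the INFORMATIONAL_VERSIONS list, and the scoredPassRate() helper are assumptions for illustration and are not taken from the conformance tooling.

```typescript
// Hypothetical per-scenario result.
interface ScenarioResult {
  scenario: string;
  specVersions: string[]; // e.g. ["2025-06-18", "2025-11-25"], ["draft"], ["extension"]
  passed: boolean;
}

// Version tags treated as informational in this report.
const INFORMATIONAL_VERSIONS = ["draft", "extension"];

// A scenario is scored when at least one of its tags is a dated spec version.
function isScored(result: ScenarioResult): boolean {
  return result.specVersions.some((v) => INFORMATIONAL_VERSIONS.indexOf(v) === -1);
}

function scoredPassRate(results: ScenarioResult[]): { passed: number; total: number; rate: number } {
  const scored = results.filter(isScored);
  const passed = scored.filter((r) => r.passed).length;
  // Assumption: an empty scored set reports a rate of 0 rather than NaN.
  const rate = scored.length === 0 ? 0 : passed / scored.length;
  return { passed, total: scored.length, rate };
}

const sampleResults: ScenarioResult[] = [
  { scenario: "initialize", specVersions: ["2025-06-18", "2025-11-25"], passed: true },
  { scenario: "auth/resource-mismatch", specVersions: ["draft"], passed: true },
  { scenario: "auth/client-credentials-jwt", specVersions: ["extension"], passed: false },
];
console.log(scoredPassRate(sampleResults)); // only "initialize" is scored
```

Under this rule the failing extension scenario does not reduce the scored pass rate, which matches how the report treats the three extension failures.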


Server Conformance Details

Pass rate: 100% (30/30)

All 30 server scenarios pass. They span spec versions 2025-06-18 and 2025-11-25.

  • 2025-06-18: 22 scenarios, all pass (most also apply to 2025-11-25)
  • 2025-11-25: 30 scenarios total (includes ones also tagged 2025-06-18), all pass
  • 2025-11-25 only (4 scenarios): all pass
    • sse-multiple-streams — PASS
    • elicitation-sep1330-enums — PASS
    • elicitation-sep1034-defaults — PASS
    • dns-rebinding-protection — PASS

| # | Scenario | Spec Version(s) | Status |
| --- | --- | --- | --- |
| 1 | sse-multiple-streams | 2025-11-25 | ✅ PASS |
| 2 | elicitation-sep1330-enums | 2025-11-25 | ✅ PASS |
| 3 | elicitation-sep1034-defaults | 2025-11-25 | ✅ PASS |
| 4 | dns-rebinding-protection | 2025-11-25 | ✅ PASS |
| 5–30 | (remaining 26 scenarios) | 2025-06-18, 2025-11-25 | ✅ PASS |

All 30 scenarios: PASS


Client Conformance Details

Date-versioned pass rate: 100% (23/23)
Full suite (incl. informational): 26/26 scored + 3 extension failures

Core Scenarios (4/4)

| Scenario | Spec Version(s) | Status |
| --- | --- | --- |
| initialize | 2025-06-18, 2025-11-25 | ✅ PASS |
| tools_call | 2025-06-18, 2025-11-25 | ✅ PASS |
| sse-retry | 2025-11-25 | ✅ PASS |
| elicitation-sep1034-client-defaults | 2025-11-25 | ✅ PASS |

Auth Scenarios — Date-versioned (19/19)

2025-03-26 (2/2):

| Scenario | Status |
| --- | --- |
| auth/oauth-metadata-backcompat | ✅ PASS |
| auth/oauth-endpoint-fallback | ✅ PASS |

2025-06-18 (3/3):

| Scenario | Status |
| --- | --- |
| auth/token-endpoint-auth-post | ✅ PASS |
| auth/token-endpoint-auth-none | ✅ PASS |
| auth/token-endpoint-auth-basic | ✅ PASS |

2025-11-25 (14/14):

| Scenario | Status |
| --- | --- |
| auth/scope-step-up | ✅ PASS |
| auth/scope-retry-limit | ✅ PASS |
| auth/scope-omitted-when-undefined | ✅ PASS |
| auth/scope-from-www-authenticate | ✅ PASS |
| auth/scope-from-scopes-supported | ✅ PASS |
| auth/pre-registration | ✅ PASS |
| auth/metadata-var1 | ✅ PASS |
| auth/metadata-var2 | ✅ PASS |
| auth/metadata-var3 | ✅ PASS |
| auth/metadata-default | ✅ PASS |
| auth/basic-cimd | ✅ PASS |
| auth/token-endpoint-auth-post | ✅ PASS |
| auth/token-endpoint-auth-none | ✅ PASS |
| auth/token-endpoint-auth-basic | ✅ PASS |

Auth Scenarios — Draft (3/3, informational)

Draft scenarios are informational and not scored for tier classification.

| Scenario | Spec Version | Status |
| --- | --- | --- |
| auth/resource-mismatch | draft | ✅ PASS |
| auth/offline-access-scope | draft | ✅ PASS |
| auth/offline-access-not-supported | draft | ✅ PASS |

Auth Scenarios — Extension (0/3, informational)

Extension scenarios are informational and not scored for tier classification. All three failures show 8 of 10 checks passing.

| Scenario | Spec Version | Checks Passed | Status |
| --- | --- | --- | --- |
| auth/cross-app-access-complete-flow | extension | 8/10 | ❌ FAIL |
| auth/client-credentials-jwt | extension | 8/10 | ❌ FAIL |
| auth/client-credentials-basic | extension | 8/10 | ❌ FAIL |

Issue Triage Details

○ Not run — excluded by scope (--skip-repo-health)

Documentation Coverage

○ Not run — excluded by scope (--skip-docs-eval)

Policy Evaluation

○ Not run — excluded by scope (--skip-repo-health)

Remediation Guide: modelcontextprotocol/csharp-sdk

Date: 2026-04-17
Scope: partial run — conformance only (repo health and docs eval skipped via --skip-repo-health --skip-docs-eval)
Current Tier: N/A (partial run)

Findings from scoped run

All date-versioned conformance scenarios (server and client) pass at 100%. No remediation is required for tier-scored conformance tests.

| # | Finding | Status | Detail |
| --- | --- | --- | --- |
| 1 | Server Conformance | ✓ PASS | 30/30 (100%) |
| 2 | Client Conformance (date-versioned) | ✓ PASS | 23/23 (100%) |
| 3 | Client Conformance (extension) | Informational | 0/3; not scored for tier |

Extension Scenario Failures (informational)

The 3 failing extension scenarios each pass 8/10 checks:

  • auth/cross-app-access-complete-flow
  • auth/client-credentials-jwt
  • auth/client-credentials-basic

These are extension scenarios and do not block tier advancement. They represent client_credentials grant type support which is not yet part of the dated MCP spec.

Not evaluated (excluded by scope)

The following areas were not assessed in this partial run and require a full audit to evaluate:

  • Issue triage compliance
  • Label taxonomy
  • P0 bug resolution times
  • Stable release status
  • Spec tracking gap
  • Documentation coverage
  • Dependency update policy
  • Roadmap
  • Versioning policy

A full audit (--scope full) is required to enumerate all tier gaps and assign a definitive tier classification.

Recommended Next Steps

  1. Run a full tier audit to get definitive tier classification — repo health passed in a separate scoped run, so a full run should confirm overall tier status
  2. Investigate 3 extension scenario failures if client_credentials support is planned
  3. Consider adding a baseline.yml to document expected extension failures

jeffhandley and others added 2 commits April 16, 2026 23:27
Previously, omitting --conformance-server-url or --client-cmd would
produce a definitive tier classification despite incomplete data.

Extract resolveTierCheckPlan() to detect missing inputs and treat
them as implicit skips. partial_run is derived from actual check
statuses (any skipped check -> partial) rather than only from the
explicit --skip-conformance flag.

- Add resolveTierCheckPlan() with unit test for implicit skip
- Add checks/skipped.ts with factory functions for skipped payloads
- Handle skipped status in tier-logic, output formatters (empty
  circle symbol), and conformance matrix display
- Add partial_run field to TierScorecard type
- Split CLI checks docs into Conformance Tests / Repository Health
- Update SKILL.md partial-run guidance and report template

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Expand the tier-check CLI with granular skip flags so users can run
targeted subsets of the audit:

  --skip-server-conformance   skip only server conformance
  --skip-client-conformance   skip only client conformance
  --skip-conformance          skip both (server + client)
  --skip-repo-health          skip all GitHub-backed repo-health checks

The skill layer adds --scope presets (server, client, conformance,
repo-health, full) that expand to the appropriate skip flags, plus
--skip-docs-eval for the AI documentation evaluation.

--skip-repo-health also suppresses the AI policy evaluation in the
skill layer, keeping repo-health behavior aligned end-to-end.

When any check is skipped the run is partial: tier classification
shows N/A and skipped rows display as empty circles. A nothing-to-run
guard exits early when every check category is excluded.

- Expand resolveTierCheckPlan() with skip flag parameters
- Add CLI options and nothing-to-run guard
- Token now optional when --skip-repo-health excludes all API calls
- Add unit tests for nothing-to-run and explicit scope exclusions
- Add scope presets and skip flag docs to SKILL.md and READMEs
- Add partial-run examples to root README.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pkg-pr-new

pkg-pr-new bot commented Apr 17, 2026

npx https://pkg.pr.new/@modelcontextprotocol/conformance@236

commit: a1fc0b2
