feat: add score command #2648
Open
adamaltman wants to merge 23 commits into main from aa/api-score
23 commits
81e83a2 feat: add score command (adamaltman)
7c277f9 chore: add more tests (adamaltman)
f9bd3e7 refactor: address PR review feedback for score command (adamaltman)
a98ad59 fix: add missing version property to test CommandArgs (adamaltman)
95fe9f0 chore: add e2e (RomanHotsiy)
1fc409c chore: refactor the visitor to use native pattern (RomanHotsiy)
f7d2699 chore: fix semantic line breaks (adamaltman)
54303d9 chore: refactor based on feedback (adamaltman)
3cb0f51 fix: adjust walking method and add debugging arguments (adamaltman)
a150a07 chore: address PR review feedback from tatomyr (adamaltman)
126f5e0 fix: resolve CI typecheck failures (adamaltman)
8439a76 refactor: extract collectMetrics and add schema memoization (adamaltman)
4376c02 chore: add median to raw metrics summary and fix label (adamaltman)
483a0aa refactor: rename workflow to dependency across score command (adamaltman)
dbcdb86 chore: address PR review feedback (adamaltman)
4e5301f chore: address tatomyr review feedback (round 3) (adamaltman)
916b03a fix: address PR review feedback and bugbot findings (adamaltman)
6b9d017 fix: address bugbot findings (round 2) (adamaltman)
bd6d82a refactor: consolidate Integration Simplicity and Agent Readiness into… (adamaltman)
47a9667 fix: address bugbot findings (round 3) (adamaltman)
edcd5a7 refactor: address tatomyr review feedback (round 4) (adamaltman)
64884e6 fix: address bugbot findings (round 4) (adamaltman)
db14ed7 Merge branch 'main' into aa/api-score (adamaltman)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| '@redocly/cli': minor | ||
| --- | ||
|
|
||
| Added new `score` command that analyzes OpenAPI 3.x descriptions and produces an AI Agent Readiness score (0-100). | ||
| Reports normalized subscores, raw per-operation metrics, and top hotspot operations with human-readable explanations. Supports `--format=stylish` (default) and `--format=json` output. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,133 @@ | ||
| # `score` | ||
|
|
||
| ## Introduction | ||
|
|
||
| The `score` command analyzes an OpenAPI description and produces a composite **Agent Readiness** score (0–100) that measures how easy the API is to integrate and how usable it is by AI agents and LLM-based tooling. Higher is better. | ||
|
|
||
| In addition to the top-level score, the command reports normalized subscores, raw metrics for every operation, and a list of **hotspot operations** — the endpoints most likely to cause integration friction — along with human-readable explanations. | ||
|
|
||
| {% admonition type="warning" name="Important" %} | ||
| The `score` command is considered an experimental feature. This means it's still a work in progress and may go through major changes. | ||
|
|
||
| The `score` command supports OpenAPI 3.x descriptions only. | ||
| {% /admonition %} | ||
|
|
||
| ### Metrics | ||
|
|
||
| The following raw metrics are collected per operation and aggregated across the document: | ||
|
|
||
| | Metric | Description | | ||
| | ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | Parameter count | Total parameters (path, query, header, cookie) per operation. | | ||
| | Required parameter count | How many of those parameters are required. | | ||
| | Request body presence | Whether the operation defines a request body. | | ||
| | Top-level writable field count | Number of non-`readOnly` top-level properties in request schemas. | | ||
| | Max request/response schema depth | Deepest nesting level in request and response schemas. | | ||
| | Polymorphism count | Number of `oneOf`, `anyOf`, and `allOf` usages. `anyOf` is penalized more heavily because it allows ambiguous combinations of schemas, making it harder for consumers and AI agents to determine the correct shape. | | ||
| | Property count | Total schema properties across request and response. | | ||
| | Description coverage | Fraction of operations, parameters, and schema properties that have descriptions. | | ||
| | Ambiguous identifier count | Parameters with generic names (e.g. `id`, `name`, `type`) and no description. | | ||
| | Constraint coverage | Count of constraining keywords (`enum`, `format`, `pattern`, `minimum`, `maximum`, `minLength`, `maxLength`, `discriminator`, etc.). | | ||
| | Request/response example coverage | Whether request and response media types include `example` or `examples`. | | ||
| | Structured error response coverage | How many 4xx/5xx responses include a content schema or meaningful description. | | ||
| | Security scheme coverage | Whether operations reference documented security schemes with descriptions. | | ||
| | Cross-operation dependency depth | Inferred from shared `$ref` usage across operations. Operations that share many schemas form a dependency graph; deeper graphs indicate tightly coupled multi-step interactions. | | ||
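Review note: the "max request/response schema depth" metric in the table above is easier to reason about with a concrete counting rule. The following is a minimal sketch, not the PR's collector. It assumes schemas are already dereferenced, that each `properties` or `items` hop adds one level, and that `oneOf`/`anyOf`/`allOf` do not add a level by themselves. `SimpleSchema` and `maxSchemaDepth` are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical, simplified schema shape; $refs are assumed to be resolved already.
interface SimpleSchema {
  properties?: Record<string, SimpleSchema>;
  items?: SimpleSchema;
  oneOf?: SimpleSchema[];
  anyOf?: SimpleSchema[];
  allOf?: SimpleSchema[];
}

// Returns the deepest nesting level below `schema`; a schema without children has depth 0.
// `path` holds the schemas on the current traversal path so circular schemas terminate.
function maxSchemaDepth(schema: SimpleSchema, path: Set<SimpleSchema> = new Set()): number {
  if (path.has(schema)) return 0; // circular reference: stop descending
  path.add(schema);

  let deepest = 0;
  for (const child of Object.values(schema.properties ?? {})) {
    deepest = Math.max(deepest, 1 + maxSchemaDepth(child, path));
  }
  if (schema.items) {
    deepest = Math.max(deepest, 1 + maxSchemaDepth(schema.items, path));
  }
  // Assumption: composition keywords are transparent and do not add a level.
  const variants = [...(schema.oneOf ?? []), ...(schema.anyOf ?? []), ...(schema.allOf ?? [])];
  for (const variant of variants) {
    deepest = Math.max(deepest, maxSchemaDepth(variant, path));
  }

  path.delete(schema);
  return deepest;
}
```

Under these assumptions an object whose `customer.address.city` chain is three properties deep has depth 3. The documentation does not say whether the real metric counts array `items` or composition keywords as levels, so the numbers reported by the command may differ.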
|
|
||
| ### Subscores | ||
|
|
||
| The following subscores are normalized to 0–1 and combined into the composite Agent Readiness score: | ||
|
|
||
| `parameterSimplicity`, `schemaSimplicity`, `documentationQuality`, `constraintClarity`, `exampleCoverage`, `errorClarity`, `dependencyClarity`, `identifierClarity`, `polymorphismClarity`, `discoverability`. | ||
|
|
||
| The `discoverability` subscore reflects the total number of operations in the API. Larger APIs (approaching or exceeding 1,000 operations) receive a lower discoverability score because finding the right endpoint becomes harder for both humans and AI agents. | ||
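Review note: the documentation says the ten subscores are normalized to 0–1 and combined into a 0–100 composite, but it does not state how they are weighted. The sketch below assumes a weighted mean with equal weights by default. The subscore names are taken from the list above; `compositeScore` and its `weights` parameter are hypothetical.

```typescript
// Subscore names as listed in the documentation.
const SUBSCORE_NAMES = [
  'parameterSimplicity',
  'schemaSimplicity',
  'documentationQuality',
  'constraintClarity',
  'exampleCoverage',
  'errorClarity',
  'dependencyClarity',
  'identifierClarity',
  'polymorphismClarity',
  'discoverability',
] as const;

type SubscoreName = (typeof SUBSCORE_NAMES)[number];
type Subscores = Record<SubscoreName, number>;

// ASSUMPTION: equal weights unless overridden. The real command may weight differently.
function compositeScore(
  subscores: Subscores,
  weights: Partial<Record<SubscoreName, number>> = {}
): number {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const name of SUBSCORE_NAMES) {
    const weight = weights[name] ?? 1;
    const value = Math.min(1, Math.max(0, subscores[name])); // clamp to the 0–1 range
    weightedSum += weight * value;
    totalWeight += weight;
  }
  if (totalWeight === 0) return 0;
  // Scale to 0–100 and keep one decimal place, like the stylish output.
  return Math.round((weightedSum / totalWeight) * 1000) / 10;
}
```

For the sample subscores in the stylish output example, an unweighted mean gives 83 rather than the 68.3 shown there, so the command's real weighting (or the sample figures) evidently differs from this sketch. Stating the weights in the docs would make the composite score reproducible by readers.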
|
|
||
| ### Hotspots | ||
|
|
||
| The command identifies the operations with the lowest scores and provides reasons such as: | ||
|
|
||
| - "High parameter count (N)" | ||
| - "Deep schema nesting (depth M)" | ||
| - "Polymorphism (anyOf) without discriminator" | ||
| - "Missing request and response examples" | ||
| - "No structured error responses (4xx/5xx)" | ||
| - "Missing operation description" | ||
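Review note: hotspot selection as described above amounts to sorting operations by score and attaching reasons. A minimal sketch follows. The reason strings are copied from the list above, while the `OperationSummary` shape, the `MAX_PARAMS` and `MAX_DEPTH` thresholds, and the function names are assumptions made for illustration; the cut-offs actually used by the command are not stated in this excerpt.

```typescript
// Hypothetical per-operation summary; field names are illustrative only.
interface OperationSummary {
  id: string; // e.g. "POST /orders (createOrder)"
  score: number; // per-operation score, 0–100
  parameterCount: number;
  maxSchemaDepth: number;
  hasExamples: boolean;
  hasDescription: boolean;
}

// ASSUMED thresholds, chosen only to make the example concrete.
const MAX_PARAMS = 8;
const MAX_DEPTH = 4;

// Builds human-readable reasons for a low score.
function explain(op: OperationSummary): string[] {
  const reasons: string[] = [];
  if (op.parameterCount > MAX_PARAMS) reasons.push(`High parameter count (${op.parameterCount})`);
  if (op.maxSchemaDepth > MAX_DEPTH) reasons.push(`Deep schema nesting (depth ${op.maxSchemaDepth})`);
  if (!op.hasExamples) reasons.push('Missing request and response examples');
  if (!op.hasDescription) reasons.push('Missing operation description');
  return reasons;
}

// Returns the `limit` lowest-scoring operations, worst first, without mutating the input.
function topHotspots(
  ops: OperationSummary[],
  limit = 3
): { id: string; score: number; reasons: string[] }[] {
  return [...ops]
    .sort((a, b) => a.score - b.score)
    .slice(0, limit)
    .map((op) => ({ id: op.id, score: op.score, reasons: explain(op) }));
}
```

Copying the array before sorting keeps the caller's operation order intact, which matters if the same list is later printed by `--operation-details` in a different order.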
|
|
||
| ## Usage | ||
|
|
||
| ```bash | ||
| redocly score <api> | ||
| redocly score <api> [--format=<option>] | ||
| ``` | ||
|
|
||
| ## Options | ||
|
|
||
| | Option | Type | Description | | ||
| | -------------------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | api | string | **REQUIRED.** Path to the API description filename or alias that you want to score. Refer to [the API section](#specify-api) for more details. | | ||
| | --config | string | Specify path to the [configuration file](../configuration/index.md). | | ||
| | --format | string | Format for the output.<br />**Possible values:** `stylish`, `json`. Default value is `stylish`. | | ||
| | --operation-details | boolean | Print a per-operation metrics table sorted by property count. | | ||
| | --debug-operation-id | string | Print a detailed schema breakdown for a specific operation (by `operationId` or `METHOD /path`). | | ||
| | --help | boolean | Show help. | | ||
| | --lint-config | string | Specify the severity level for the configuration file. <br/> **Possible values:** `warn`, `error`, `off`. Default value is `warn`. | | ||
| | --version | boolean | Show version number. | | ||
|
|
||
| ## Examples | ||
|
|
||
| ### Specify API | ||
|
|
||
| #### Pass an API directly | ||
|
|
||
| ```bash | ||
| redocly score openapi/openapi.yaml | ||
| ``` | ||
|
|
||
| ### Specify output format | ||
|
|
||
| #### Stylish output (default) | ||
|
|
||
| The default output format shows a human-readable summary in your terminal: | ||
|
|
||
| ```sh | ||
| Scores | ||
|
|
||
| Agent Readiness: 68.3/100 | ||
|
|
||
| Subscores | ||
|
|
||
| Parameter Simplicity [████████████████░░░░] 80% | ||
| Schema Simplicity [██████████████░░░░░░] 70% | ||
| Documentation Quality [████████████░░░░░░░░] 60% | ||
| Constraint Clarity [██████████░░░░░░░░░░] 50% | ||
| Example Coverage [████████████████████] 100% | ||
| Error Clarity [████████████████░░░░] 80% | ||
| Dependency Clarity [██████████████████░░] 90% | ||
| Identifier Clarity [████████████████████] 100% | ||
| Polymorphism Clarity [████████████████████] 100% | ||
| Discoverability [████████████████████] 100% | ||
|
|
||
| Top 3 Hotspot Operations | ||
|
|
||
| POST /orders (createOrder) | ||
| Agent Readiness: 38.7 | ||
| - High parameter count (12) | ||
| - Deep schema nesting (depth 6) | ||
| - Missing request and response examples | ||
|
|
||
| PUT /orders/{id} (updateOrder) | ||
| Agent Readiness: 44.0 | ||
| - Polymorphism (anyOf) without discriminator (3 anyOf) | ||
| - No structured error responses (4xx/5xx) | ||
| ``` | ||
|
|
||
| #### JSON output | ||
|
|
||
| Use `--format=json` for machine-readable output: | ||
|
|
||
| ```bash | ||
| redocly score openapi.yaml --format=json | ||
| ``` | ||
|
|
||
| The JSON output includes the full data: top-level scores, subscores, per-operation raw metrics, per-operation scores, dependency depths, and hotspot details with reasoning. | ||
|
|
||
| The JSON format is suitable for CI pipelines, quality gates, or feeding results into dashboards. | ||
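Review note: since the docs position `--format=json` for CI quality gates, a short consumer example may help readers. The sketch below is written under a loud assumption: the JSON schema is not shown in this excerpt, so the top-level `agentReadiness` field is hypothetical and must be adjusted to the real output. `evaluateGate` is likewise an illustrative name, not part of the CLI.

```typescript
// Result of comparing a score report against a minimum threshold.
interface GateResult {
  passed: boolean;
  message: string;
}

// ASSUMPTION: the report is a JSON object with a numeric top-level `agentReadiness` field.
// The actual field name and nesting in the command's JSON output may differ.
function evaluateGate(rawJson: string, minimumScore: number): GateResult {
  const report: unknown = JSON.parse(rawJson);
  const score =
    typeof report === 'object' && report !== null
      ? (report as Record<string, unknown>).agentReadiness
      : undefined;

  if (typeof score !== 'number' || Number.isNaN(score)) {
    return { passed: false, message: 'Report has no numeric agentReadiness field' };
  }
  const passed = score >= minimumScore;
  return {
    passed,
    message: `Agent Readiness ${score} ${passed ? '>=' : '<'} required ${minimumScore}`,
  };
}
```

In a pipeline, the report text would come from the command's JSON output (for example, captured from `redocly score openapi.yaml --format=json`, assuming the JSON is written to standard output), and the job would fail when `passed` is `false`. Treating a missing field as a failure keeps the gate from silently passing if the output shape changes while the command is experimental.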
41 changes: 41 additions & 0 deletions
packages/cli/src/commands/score/__tests__/collect-metrics-helper.ts
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| import { | ||
| normalizeTypes, | ||
| getTypes, | ||
| resolveDocument, | ||
| BaseResolver, | ||
| Source, | ||
| type Document, | ||
| type SpecVersion, | ||
| type WalkContext, | ||
| } from '@redocly/openapi-core'; | ||
|
|
||
| import { collectMetrics, type CollectMetricsResult } from '../collect-metrics.js'; | ||
|
|
||
| /** | ||
| * Convenience wrapper that resolves a parsed OpenAPI document and collects metrics. | ||
| * Useful in tests where you don't already have resolved types. | ||
| */ | ||
| export async function collectDocumentMetrics( | ||
| parsed: Record<string, unknown>, | ||
| options?: { specVersion?: SpecVersion; debugOperationId?: string } | ||
| ): Promise<CollectMetricsResult> { | ||
| const specVersion: SpecVersion = options?.specVersion ?? 'oas3_0'; | ||
| const types = normalizeTypes(getTypes(specVersion), {}); | ||
| const source = new Source('score.yaml', JSON.stringify(parsed)); | ||
| const document: Document = { source, parsed }; | ||
| const externalRefResolver = new BaseResolver(); | ||
| const resolvedRefMap = await resolveDocument({ | ||
| rootDocument: document, | ||
| rootType: types.Root, | ||
| externalRefResolver, | ||
| }); | ||
| const ctx: WalkContext = { problems: [], specVersion, visitorsData: {} }; | ||
|
|
||
| return collectMetrics({ | ||
| document, | ||
| types, | ||
| resolvedRefMap, | ||
| ctx, | ||
| debugOperationId: options?.debugOperationId, | ||
| }); | ||
| } |
88 changes: 88 additions & 0 deletions
packages/cli/src/commands/score/__tests__/dependency-graph.test.ts
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| import { computeDependencyDepths } from '../collectors/dependency-graph.js'; | ||
| import type { OperationMetrics } from '../types.js'; | ||
|
|
||
| function makeOp(path: string, method: string, refs: string[]): OperationMetrics { | ||
| return { | ||
| path, | ||
| method, | ||
| parameterCount: 0, | ||
| requiredParameterCount: 0, | ||
| paramsWithDescription: 0, | ||
| requestBodyPresent: false, | ||
| topLevelWritableFieldCount: 0, | ||
| maxRequestSchemaDepth: 0, | ||
| maxResponseSchemaDepth: 0, | ||
| polymorphismCount: 0, | ||
| anyOfCount: 0, | ||
| hasDiscriminator: false, | ||
| operationDescriptionPresent: false, | ||
| schemaPropertiesWithDescription: 0, | ||
| totalSchemaProperties: 0, | ||
| constraintCount: 0, | ||
| requestExamplePresent: false, | ||
| responseExamplePresent: false, | ||
| structuredErrorResponseCount: 0, | ||
| totalErrorResponses: 0, | ||
| ambiguousIdentifierCount: 0, | ||
| refsUsed: new Set(refs), | ||
| }; | ||
| } | ||
|
|
||
| describe('computeDependencyDepths', () => { | ||
| it('returns depth 0 for isolated operations', () => { | ||
| const ops = new Map([ | ||
| ['opA', makeOp('/a', 'get', ['#/components/schemas/A'])], | ||
| ['opB', makeOp('/b', 'get', ['#/components/schemas/B'])], | ||
| ]); | ||
| const depths = computeDependencyDepths(ops); | ||
| expect(depths.get('opA')).toBe(0); | ||
| expect(depths.get('opB')).toBe(0); | ||
| }); | ||
|
|
||
| it('returns depth 1 for two operations sharing a ref', () => { | ||
| const shared = '#/components/schemas/Shared'; | ||
| const ops = new Map([ | ||
| ['opA', makeOp('/a', 'get', [shared])], | ||
| ['opB', makeOp('/b', 'post', [shared])], | ||
| ]); | ||
| const depths = computeDependencyDepths(ops); | ||
| expect(depths.get('opA')).toBe(1); | ||
| expect(depths.get('opB')).toBe(1); | ||
| }); | ||
|
|
||
| it('returns depth 2 for a linear chain A-B-C', () => { | ||
| const ops = new Map([ | ||
| ['opA', makeOp('/a', 'get', ['#/schemas/AB'])], | ||
| ['opB', makeOp('/b', 'post', ['#/schemas/AB', '#/schemas/BC'])], | ||
| ['opC', makeOp('/c', 'put', ['#/schemas/BC'])], | ||
| ]); | ||
| const depths = computeDependencyDepths(ops); | ||
| expect(depths.get('opA')).toBe(2); | ||
| expect(depths.get('opC')).toBe(2); | ||
| expect(depths.get('opB')).toBeLessThanOrEqual(2); | ||
| }); | ||
|
|
||
| it('handles empty operations map', () => { | ||
| const depths = computeDependencyDepths(new Map()); | ||
| expect(depths.size).toBe(0); | ||
| }); | ||
|
|
||
| it('handles operations with no refs', () => { | ||
| const ops = new Map([['opA', makeOp('/a', 'get', [])]]); | ||
| const depths = computeDependencyDepths(ops); | ||
| expect(depths.get('opA')).toBe(0); | ||
| }); | ||
|
|
||
| it('groups all operations sharing the same ref', () => { | ||
| const shared = '#/components/schemas/Common'; | ||
| const ops = new Map([ | ||
| ['opA', makeOp('/a', 'get', [shared])], | ||
| ['opB', makeOp('/b', 'post', [shared])], | ||
| ['opC', makeOp('/c', 'put', [shared])], | ||
| ]); | ||
| const depths = computeDependencyDepths(ops); | ||
| expect(depths.get('opA')).toBe(1); | ||
| expect(depths.get('opB')).toBe(1); | ||
| expect(depths.get('opC')).toBe(1); | ||
| }); | ||
| }); |
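Review note: `collectors/dependency-graph.ts` is not visible in this excerpt, so the following is only one reading that is consistent with the expectations in the tests above. It links two operations when they share at least one `$ref` and takes an operation's depth to be the longest shortest-path distance to any operation reachable from it. `sketchDependencyDepths` and `OpRefs` are hypothetical names, and the real `computeDependencyDepths` may use a different definition.

```typescript
// Minimal slice of the operation metrics needed for the graph.
type OpRefs = { refsUsed: Set<string> };

function sketchDependencyDepths(ops: Map<string, OpRefs>): Map<string, number> {
  // Index each ref to the operations that use it.
  const opsByRef = new Map<string, string[]>();
  for (const [opId, op] of ops) {
    for (const ref of op.refsUsed) {
      const users = opsByRef.get(ref) ?? [];
      users.push(opId);
      opsByRef.set(ref, users);
    }
  }

  // Undirected adjacency: operations sharing a ref are neighbours.
  const neighbours = new Map<string, Set<string>>();
  for (const opId of ops.keys()) neighbours.set(opId, new Set());
  for (const users of opsByRef.values()) {
    for (const a of users) {
      for (const b of users) {
        if (a !== b) neighbours.get(a)?.add(b);
      }
    }
  }

  // Breadth-first search from every operation; depth is the farthest distance reached.
  const depths = new Map<string, number>();
  for (const start of ops.keys()) {
    const distance = new Map<string, number>([[start, 0]]);
    const queue: string[] = [start];
    let farthest = 0;
    while (queue.length > 0) {
      const current = queue.shift();
      if (current === undefined) break;
      const d = distance.get(current) ?? 0;
      farthest = Math.max(farthest, d);
      for (const next of neighbours.get(current) ?? []) {
        if (!distance.has(next)) {
          distance.set(next, d + 1);
          queue.push(next);
        }
      }
    }
    depths.set(start, farthest);
  }
  return depths;
}
```

Under this reading the middle operation of the A-B-C chain gets depth 1, which satisfies the `toBeLessThanOrEqual(2)` assertion. If the intended value for `opB` is known, asserting it exactly would pin the semantics down more firmly than the current inequality.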