23 commits
81e83a2
feat: add score command
adamaltman Mar 12, 2026
7c277f9
chore: add more tests
adamaltman Mar 12, 2026
f9bd3e7
refactor: address PR review feedback for score command
adamaltman Mar 12, 2026
a98ad59
fix: add missing version property to test CommandArgs
adamaltman Mar 12, 2026
95fe9f0
chore: add e2e
RomanHotsiy Mar 12, 2026
1fc409c
chore: refactor the visitor to use native pattern
RomanHotsiy Mar 12, 2026
f7d2699
chore: fix semantic line breaks
adamaltman Mar 12, 2026
54303d9
chore: refactor based on feedback
adamaltman Mar 13, 2026
3cb0f51
fix: adjust walking method and add debugging arguments
adamaltman Mar 13, 2026
a150a07
chore: address PR review feedback from tatomyr
adamaltman Mar 24, 2026
126f5e0
fix: resolve CI typecheck failures
adamaltman Mar 24, 2026
8439a76
refactor: extract collectMetrics and add schema memoization
adamaltman Mar 24, 2026
4376c02
chore: add median to raw metrics summary and fix label
adamaltman Mar 24, 2026
483a0aa
refactor: rename workflow to dependency across score command
adamaltman Mar 24, 2026
dbcdb86
chore: address PR review feedback
adamaltman Mar 27, 2026
4e5301f
chore: address tatomyr review feedback (round 3)
adamaltman Apr 1, 2026
916b03a
fix: address PR review feedback and bugbot findings
adamaltman Apr 2, 2026
6b9d017
fix: address bugbot findings (round 2)
adamaltman Apr 2, 2026
bd6d82a
refactor: consolidate Integration Simplicity and Agent Readiness into…
adamaltman Apr 4, 2026
47a9667
fix: address bugbot findings (round 3)
adamaltman Apr 7, 2026
edcd5a7
refactor: address tatomyr review feedback (round 4)
adamaltman Apr 8, 2026
64884e6
fix: address bugbot findings (round 4)
adamaltman Apr 8, 2026
db14ed7
Merge branch 'main' into aa/api-score
adamaltman Apr 16, 2026
6 changes: 6 additions & 0 deletions .changeset/add-score-command.md
@@ -0,0 +1,6 @@
---
'@redocly/cli': minor
---

Added new `score` command that analyzes OpenAPI 3.x descriptions and produces an AI Agent Readiness score (0-100).
Reports normalized subscores, raw per-operation metrics, and top hotspot operations with human-readable explanations. Supports `--format=stylish` (default) and `--format=json` output.
5 changes: 3 additions & 2 deletions docs/@v2/commands/index.md
@@ -13,10 +13,11 @@ Documentation commands:

API management commands:

- [`stats`](stats.md) Gather statistics for a document.
- [`bundle`](bundle.md) Bundle API description.
- [`split`](split.md) Split API description into a multi-file structure.
- [`join`](join.md) Join API descriptions [experimental feature].
- [`score`](score.md) Score an API for integration simplicity and AI agent readiness.
- [`split`](split.md) Split API description into a multi-file structure.
- [`stats`](stats.md) Gather statistics for a document.

Linting commands:

133 changes: 133 additions & 0 deletions docs/@v2/commands/score.md
@@ -0,0 +1,133 @@
# `score`

## Introduction

The `score` command analyzes an OpenAPI description and produces a composite **Agent Readiness** score (0–100) that measures how easy the API is to integrate and how usable it is by AI agents and LLM-based tooling. Higher is better.

In addition to the top-level score, the command reports normalized subscores, raw metrics for every operation, and a list of **hotspot operations** — the endpoints most likely to cause integration friction — along with human-readable explanations.

{% admonition type="warning" name="Important" %}
The `score` command is considered an experimental feature. This means it's still a work in progress and may go through major changes.

The `score` command supports OpenAPI 3.x descriptions only.
{% /admonition %}

### Metrics

The following raw metrics are collected per operation and aggregated across the document:

| Metric | Description |
| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Parameter count | Total parameters (path, query, header, cookie) per operation. |
| Required parameter count | How many of those parameters are required. |
| Request body presence | Whether the operation defines a request body. |
| Top-level writable field count | Number of non-`readOnly` top-level properties in request schemas. |
| Max request/response schema depth | Deepest nesting level in request and response schemas. |
| Polymorphism count | Number of `oneOf`, `anyOf`, and `allOf` usages. `anyOf` is penalized more heavily because it allows ambiguous combinations of schemas, making it harder for consumers and AI agents to determine the correct shape. |
| Property count | Total schema properties across request and response. |
| Description coverage | Fraction of operations, parameters, and schema properties that have descriptions. |
| Ambiguous identifier count | Parameters with generic names (e.g. `id`, `name`, `type`) and no description. |
| Constraint coverage | Count of constraining keywords (`enum`, `format`, `pattern`, `minimum`, `maximum`, `minLength`, `maxLength`, `discriminator`, etc.). |
| Request/response example coverage | Whether request and response media types include `example` or `examples`. |
| Structured error response coverage | How many 4xx/5xx responses include a content schema or meaningful description. |
| Security scheme coverage | Whether operations reference documented security schemes with descriptions. |
| Cross-operation dependency depth | Inferred from shared `$ref` usage across operations. Operations that share many schemas form a dependency graph; deeper graphs indicate tightly coupled multi-step interactions. |
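As an illustration of one metric from the table, the following is a minimal sketch of how an ambiguous identifier count could be derived. The `ParameterLike` shape, the `GENERIC_NAMES` list, and the `countAmbiguousIdentifiers` function are hypothetical names used for this example only; the documentation gives `id`, `name`, and `type` as examples, so the real list of generic names may be longer.

```typescript
// Hypothetical parameter shape; the real collector's types may differ.
interface ParameterLike {
  name: string;
  description?: string;
}

// Assumption: only the three generic names mentioned in the docs are listed here.
const GENERIC_NAMES = new Set(['id', 'name', 'type']);

// Counts parameters that have a generic name and no (non-blank) description.
function countAmbiguousIdentifiers(params: ParameterLike[]): number {
  return params.filter(
    (p) => GENERIC_NAMES.has(p.name.toLowerCase()) && !p.description?.trim()
  ).length;
}

const ambiguous = countAmbiguousIdentifiers([
  { name: 'id' }, // generic, no description: counted
  { name: 'type', description: 'Order type, for example standard or express.' },
  { name: 'customerId' }, // specific name: not counted
  { name: 'name', description: '   ' }, // blank description: counted
]);
console.log(ambiguous); // 2
```

Adding a description to a generically named parameter is enough to remove it from the count in this sketch.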

### Subscores

The following subscores are normalized to 0–1 and combined into the composite Agent Readiness score:

`parameterSimplicity`, `schemaSimplicity`, `documentationQuality`, `constraintClarity`, `exampleCoverage`, `errorClarity`, `dependencyClarity`, `identifierClarity`, `polymorphismClarity`, `discoverability`.

The `discoverability` subscore reflects the total number of operations in the API. Larger APIs (those approaching 1,000 or more operations) receive a lower discoverability score because finding the right endpoint becomes harder for both humans and AI agents.
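To make the combination step concrete, here is a minimal sketch that turns ten normalized subscores into a 0–100 composite. It assumes equal weights, which this page does not state; the command's actual weighting may differ, so treat `compositeScore` as a hypothetical helper rather than the real formula.

```typescript
const SUBSCORE_NAMES = [
  'parameterSimplicity',
  'schemaSimplicity',
  'documentationQuality',
  'constraintClarity',
  'exampleCoverage',
  'errorClarity',
  'dependencyClarity',
  'identifierClarity',
  'polymorphismClarity',
  'discoverability',
] as const;

type SubscoreName = (typeof SUBSCORE_NAMES)[number];
type Subscores = Record<SubscoreName, number>;

// Assumption: every subscore carries the same weight.
function compositeScore(subscores: Subscores): number {
  const total = SUBSCORE_NAMES.reduce((sum, name) => {
    // Clamp each subscore to the normalized 0-1 range before summing.
    const clamped = Math.min(1, Math.max(0, subscores[name]));
    return sum + clamped;
  }, 0);
  // Scale the mean to 0-100 and round to one decimal place.
  return Math.round((total / SUBSCORE_NAMES.length) * 1000) / 10;
}

// Helper for the example: the same value for every subscore.
function uniform(value: number): Subscores {
  const entries = SUBSCORE_NAMES.map((name) => [name, value] as const);
  return Object.fromEntries(entries) as Subscores;
}

console.log(compositeScore(uniform(1))); // 100
console.log(compositeScore(uniform(0.5))); // 50
```

A weighted variant would replace the plain sum with a sum of `weight * subscore` terms divided by the total weight.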

### Hotspots

The command identifies the operations with the lowest scores and provides reasons such as:

- "High parameter count (N)"
- "Deep schema nesting (depth M)"
- "Polymorphism (anyOf) without discriminator"
- "Missing request and response examples"
- "No structured error responses (4xx/5xx)"
- "Missing operation description"
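The selection described above can be sketched as sorting operations by score and attaching reasons from their raw metrics. The `OperationSummary` shape and the thresholds `MAX_PARAMS` and `MAX_DEPTH` are assumptions made for this example; the command's real cut-offs are not documented here.

```typescript
// Hypothetical per-operation summary used only for this sketch.
interface OperationSummary {
  key: string; // for example "POST /orders"
  score: number; // per-operation Agent Readiness, 0-100
  parameterCount: number;
  maxSchemaDepth: number;
  anyOfCount: number;
  hasDiscriminator: boolean;
  requestExamplePresent: boolean;
  responseExamplePresent: boolean;
  structuredErrorResponseCount: number;
  operationDescriptionPresent: boolean;
}

// Assumed thresholds; the real values may differ.
const MAX_PARAMS = 8;
const MAX_DEPTH = 4;

function explain(op: OperationSummary): string[] {
  const reasons: string[] = [];
  if (op.parameterCount > MAX_PARAMS) reasons.push(`High parameter count (${op.parameterCount})`);
  if (op.maxSchemaDepth > MAX_DEPTH) reasons.push(`Deep schema nesting (depth ${op.maxSchemaDepth})`);
  if (op.anyOfCount > 0 && !op.hasDiscriminator) {
    reasons.push('Polymorphism (anyOf) without discriminator');
  }
  if (!op.requestExamplePresent && !op.responseExamplePresent) {
    reasons.push('Missing request and response examples');
  }
  if (op.structuredErrorResponseCount === 0) reasons.push('No structured error responses (4xx/5xx)');
  if (!op.operationDescriptionPresent) reasons.push('Missing operation description');
  return reasons;
}

// Returns the `limit` lowest-scoring operations with their reasons.
function topHotspots(ops: OperationSummary[], limit = 3) {
  return [...ops]
    .sort((a, b) => a.score - b.score)
    .slice(0, limit)
    .map((op) => ({ key: op.key, score: op.score, reasons: explain(op) }));
}

const healthy = {
  anyOfCount: 0,
  hasDiscriminator: false,
  requestExamplePresent: true,
  responseExamplePresent: true,
  structuredErrorResponseCount: 2,
  operationDescriptionPresent: true,
};
const hotspots = topHotspots(
  [
    { ...healthy, key: 'GET /orders', score: 91, parameterCount: 2, maxSchemaDepth: 2 },
    {
      ...healthy,
      key: 'POST /orders',
      score: 38.7,
      parameterCount: 12,
      maxSchemaDepth: 6,
      requestExamplePresent: false,
      responseExamplePresent: false,
    },
  ],
  1
);
console.log(hotspots[0].key, hotspots[0].reasons.length); // POST /orders 3
```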

## Usage

```bash
redocly score <api>
redocly score <api> [--format=<option>]
```

## Options

| Option | Type | Description |
| -------------------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| api | string | **REQUIRED.** Path to the API description file, or the alias of an API, that you want to score. Refer to [the API section](#specify-api) for more details. |
| --config | string | Specify path to the [configuration file](../configuration/index.md). |
| --format | string | Format for the output.<br />**Possible values:** `stylish`, `json`. Default value is `stylish`. |
| --operation-details | boolean | Print a per-operation metrics table sorted by property count. |
| --debug-operation-id | string | Print a detailed schema breakdown for a specific operation (by `operationId` or `METHOD /path`). |
| --help | boolean | Show help. |
| --lint-config | string | Specify the severity level for the configuration file. <br/> **Possible values:** `warn`, `error`, `off`. Default value is `warn`. |
| --version | boolean | Show version number. |

## Examples

### Specify API

#### Pass an API directly

```bash
redocly score openapi/openapi.yaml
```

### Specify output format

#### Stylish output (default)

The default output format shows a human-readable summary in your terminal:

```sh
Scores

Agent Readiness: 68.3/100

Subscores

Parameter Simplicity [████████████████░░░░] 80%
Schema Simplicity [██████████████░░░░░░] 70%
Documentation Quality [████████████░░░░░░░░] 60%
Constraint Clarity [██████████░░░░░░░░░░] 50%
Example Coverage [████████████████████] 100%
Error Clarity [████████████████░░░░] 80%
Dependency Clarity [██████████████████░░] 90%
Identifier Clarity [████████████████████] 100%
Polymorphism Clarity [████████████████████] 100%
Discoverability [████████████████████] 100%

Top 3 Hotspot Operations

POST /orders (createOrder)
Agent Readiness: 38.7
- High parameter count (12)
- Deep schema nesting (depth 6)
- Missing request and response examples

PUT /orders/{id} (updateOrder)
Agent Readiness: 44.0
- Polymorphism (anyOf) without discriminator (3 anyOf)
- No structured error responses (4xx/5xx)
```

#### JSON output

Use `--format=json` for machine-readable output:

```bash
redocly score openapi.yaml --format=json
```

The JSON output includes the full data: top-level scores, subscores, per-operation raw metrics, per-operation scores, dependency depths, and hotspot details with reasoning.

The JSON format is suitable for CI pipelines, quality gates, or feeding results into dashboards.
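A quality gate built on the JSON output might look like the sketch below. The exact shape of the JSON report is not shown on this page, so the function takes a caller-supplied `readScore` accessor instead of hard-coding a field name. The `agentReadiness` field in the sample is purely hypothetical; inspect the real output before relying on any path.

```typescript
// Returns true when the score read from the report meets the minimum.
// `readScore` is supplied by the caller because the report shape is an assumption here.
function passesGate(
  reportJson: string,
  readScore: (report: unknown) => number,
  minimum: number
): boolean {
  const score = readScore(JSON.parse(reportJson));
  if (!Number.isFinite(score)) {
    throw new Error('Could not read a numeric score from the report');
  }
  return score >= minimum;
}

// Hypothetical report shape, for illustration only.
const sampleReport = JSON.stringify({ agentReadiness: 68.3 });
const readAgentReadiness = (report: unknown): number =>
  (report as { agentReadiness: number }).agentReadiness;

console.log(passesGate(sampleReport, readAgentReadiness, 60)); // true
console.log(passesGate(sampleReport, readAgentReadiness, 75)); // false
```

In a CI job, the report string would come from the captured output of `redocly score openapi.yaml --format=json`, and a `false` result would fail the build.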
2 changes: 2 additions & 0 deletions docs/@v2/v2.sidebars.yaml
@@ -34,6 +34,8 @@
page: commands/push-status.md
- label: respect
page: commands/respect.md
- label: score
page: commands/score.md
- label: scorecard-classic
page: commands/scorecard-classic.md
- label: split
@@ -0,0 +1,41 @@
import {
  normalizeTypes,
  getTypes,
  resolveDocument,
  BaseResolver,
  Source,
  type Document,
  type SpecVersion,
  type WalkContext,
} from '@redocly/openapi-core';

import { collectMetrics, type CollectMetricsResult } from '../collect-metrics.js';

/**
 * Convenience wrapper that resolves a parsed OpenAPI document and collects metrics.
 * Useful in tests where you don't already have resolved types.
 */
export async function collectDocumentMetrics(
  parsed: Record<string, unknown>,
  options?: { specVersion?: SpecVersion; debugOperationId?: string }
): Promise<CollectMetricsResult> {
  const specVersion: SpecVersion = options?.specVersion ?? 'oas3_0';
  const types = normalizeTypes(getTypes(specVersion), {});
  const source = new Source('score.yaml', JSON.stringify(parsed));
  const document: Document = { source, parsed };
  const externalRefResolver = new BaseResolver();
  const resolvedRefMap = await resolveDocument({
    rootDocument: document,
    rootType: types.Root,
    externalRefResolver,
  });
  const ctx: WalkContext = { problems: [], specVersion, visitorsData: {} };

  return collectMetrics({
    document,
    types,
    resolvedRefMap,
    ctx,
    debugOperationId: options?.debugOperationId,
  });
}
@@ -0,0 +1,88 @@
import { computeDependencyDepths } from '../collectors/dependency-graph.js';
import type { OperationMetrics } from '../types.js';

function makeOp(path: string, method: string, refs: string[]): OperationMetrics {
  return {
    path,
    method,
    parameterCount: 0,
    requiredParameterCount: 0,
    paramsWithDescription: 0,
    requestBodyPresent: false,
    topLevelWritableFieldCount: 0,
    maxRequestSchemaDepth: 0,
    maxResponseSchemaDepth: 0,
    polymorphismCount: 0,
    anyOfCount: 0,
    hasDiscriminator: false,
    operationDescriptionPresent: false,
    schemaPropertiesWithDescription: 0,
    totalSchemaProperties: 0,
    constraintCount: 0,
    requestExamplePresent: false,
    responseExamplePresent: false,
    structuredErrorResponseCount: 0,
    totalErrorResponses: 0,
    ambiguousIdentifierCount: 0,
    refsUsed: new Set(refs),
  };
}

describe('computeDependencyDepths', () => {
  it('returns depth 0 for isolated operations', () => {
    const ops = new Map([
      ['opA', makeOp('/a', 'get', ['#/components/schemas/A'])],
      ['opB', makeOp('/b', 'get', ['#/components/schemas/B'])],
    ]);
    const depths = computeDependencyDepths(ops);
    expect(depths.get('opA')).toBe(0);
    expect(depths.get('opB')).toBe(0);
  });

  it('returns depth 1 for two operations sharing a ref', () => {
    const shared = '#/components/schemas/Shared';
    const ops = new Map([
      ['opA', makeOp('/a', 'get', [shared])],
      ['opB', makeOp('/b', 'post', [shared])],
    ]);
    const depths = computeDependencyDepths(ops);
    expect(depths.get('opA')).toBe(1);
    expect(depths.get('opB')).toBe(1);
  });

  it('returns depth 2 for a linear chain A-B-C', () => {
    const ops = new Map([
      ['opA', makeOp('/a', 'get', ['#/schemas/AB'])],
      ['opB', makeOp('/b', 'post', ['#/schemas/AB', '#/schemas/BC'])],
      ['opC', makeOp('/c', 'put', ['#/schemas/BC'])],
    ]);
    const depths = computeDependencyDepths(ops);
    expect(depths.get('opA')).toBe(2);
    expect(depths.get('opC')).toBe(2);
    expect(depths.get('opB')).toBeLessThanOrEqual(2);
  });

  it('handles empty operations map', () => {
    const depths = computeDependencyDepths(new Map());
    expect(depths.size).toBe(0);
  });

  it('handles operations with no refs', () => {
    const ops = new Map([['opA', makeOp('/a', 'get', [])]]);
    const depths = computeDependencyDepths(ops);
    expect(depths.get('opA')).toBe(0);
  });

  it('groups all operations sharing the same ref', () => {
    const shared = '#/components/schemas/Common';
    const ops = new Map([
      ['opA', makeOp('/a', 'get', [shared])],
      ['opB', makeOp('/b', 'post', [shared])],
      ['opC', makeOp('/c', 'put', [shared])],
    ]);
    const depths = computeDependencyDepths(ops);
    expect(depths.get('opA')).toBe(1);
    expect(depths.get('opB')).toBe(1);
    expect(depths.get('opC')).toBe(1);
  });
});
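
The expectations in these tests are consistent with one simple reading of "dependency depth": treat operations that share a `$ref` as neighbors, and define an operation's depth as the greatest number of hops needed to reach any other operation connected to it. The sketch below implements that reading with a breadth-first search. It is not the actual `computeDependencyDepths` from this PR; `sketchDependencyDepths` and its simplified input (a map from operation key to the set of refs it uses) are assumptions made for illustration.

```typescript
// Sketch only: one interpretation that agrees with the tests above.
function sketchDependencyDepths(refsByOp: Map<string, Set<string>>): Map<string, number> {
  // Index operations by the refs they use.
  const opsByRef = new Map<string, string[]>();
  refsByOp.forEach((refs, op) => {
    refs.forEach((ref) => {
      const ops = opsByRef.get(ref) ?? [];
      ops.push(op);
      opsByRef.set(ref, ops);
    });
  });

  // Two operations are neighbors when they share at least one ref.
  const neighbors = new Map<string, Set<string>>();
  refsByOp.forEach((_refs, op) => neighbors.set(op, new Set<string>()));
  opsByRef.forEach((ops) => {
    for (const a of ops) {
      for (const b of ops) {
        if (a !== b) neighbors.get(a)!.add(b);
      }
    }
  });

  // Breadth-first search from each operation; depth is the farthest hop count reached.
  const depths = new Map<string, number>();
  refsByOp.forEach((_refs, start) => {
    const distance = new Map<string, number>([[start, 0]]);
    const queue: string[] = [start];
    let farthest = 0;
    for (let i = 0; i < queue.length; i++) {
      const hops = distance.get(queue[i])!;
      farthest = Math.max(farthest, hops);
      neighbors.get(queue[i])!.forEach((next) => {
        if (!distance.has(next)) {
          distance.set(next, hops + 1);
          queue.push(next);
        }
      });
    }
    depths.set(start, farthest);
  });
  return depths;
}

const chain = sketchDependencyDepths(
  new Map([
    ['opA', new Set(['#/schemas/AB'])],
    ['opB', new Set(['#/schemas/AB', '#/schemas/BC'])],
    ['opC', new Set(['#/schemas/BC'])],
  ])
);
console.log(chain.get('opA'), chain.get('opB'), chain.get('opC')); // 2 1 2
```

Under this reading the middle operation of the chain gets depth 1, which satisfies the `toBeLessThanOrEqual(2)` expectation; the real implementation may assign it a different value within that bound.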