Commit a7b9e2f

vishalg0wda and claude authored
feat(oq): fix query bugs and add schema content fields for first-class OpenAPI queries (#182)
* fix(oq): support single-quoted strings and fix group-by pipeline filtering

  Single-quoted strings in expressions (e.g. `name == 'Foo'`) were silently parsed as field references, causing where/select predicates to return empty results. This made refs-out, blast-radius, and matches appear broken when users naturally used single quotes.

  Group-by, cycles, and clusters wrote results only to result.Groups but not result.Rows, so downstream pipeline stages (where, sort, take, count) saw zero rows and produced empty output. Now all group-producing stages emit both Groups and Rows with a new GroupRowResult kind.

  Also adds query and query-reference subcommands to README.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(oq): add schema content and operation response fields for first-class OpenAPI queries

* feat: add format(yaml) output mode for raw YAML node access

* feat: add parent stage and rename edge fields to via/key/from

* fix: address PR feedback — single-quote edge cases, GroupRowResult fields, parser quote handling

* refactor: remove legacy edge_kind/edge_label/edge_from aliases

* fix: gofmt exec.go

* chore: update cmd/openapi dependency to latest root module

* fix: remove unused fields assignment in FormatYAML to fix ineffassign lint

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review — single-quote handling in matches, splitKeywordCall, findUnquotedSemicolon

  - Add stripQuotes helper to properly handle \x00-prefixed single-quote tokens in both infix and function-call forms of matches()
  - Update splitKeywordCall to track single-quoted strings when matching parentheses, preventing parse failures with parens inside quotes
  - Update findUnquotedSemicolon to track single-quoted strings, preventing incorrect semicolon splitting inside quoted strings

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: remove legacy oq syntax, split yaml format into emit stage

  Legacy syntax removed:
  - where → select(expr)
  - sort → sort_by(field; desc)
  - take/head → first(N)
  - select <fields> → pick <fields>
  - group-by → group_by(field)
  - count → length

  New emit stage:
  - Replaces format(yaml) for raw YAML node extraction
  - Wraps output under the schema/operation key name
  - Separated from output format concerns (table/json/markdown/toon)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: gofmt oq.go

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: preserve FormatHint and EmitYAML through execWhere and execUnique

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add query-reference cross-reference to spec query help text

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: propagate FormatHint/EmitYAML through all exec functions, add escape handling consistency

  - Add deriveResult helper to consistently propagate Fields, FormatHint, and EmitYAML when creating new Result objects
  - Replace all 16 bare &Result{Fields: result.Fields} with deriveResult
  - Add backslash escape handling to findUnquotedSemicolon and splitKeywordCall for consistency with splitPipeline/splitSemicolonArgs
  - Move examples to cobra Example field, clean up help text

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: move pipeline stages, operators, and examples after Usage section

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: restructure query help — Usage+Flags first, stages, then examples

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add pipeline structure hint to query help

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add string ops, arithmetic, contains, and count() to expression system

  String functions (composable — take expressions, not just field names):
    lower(expr), upper(expr), trim(expr), len(expr)
    startswith(expr, prefix), endswith(expr, suffix)
    contains(expr, substr), replace(expr, old, new)
    split(expr, sep) → array, split(expr, sep, N) → Nth segment

  Arithmetic operators: +, -, *, / with correct precedence

  contains infix operator: field contains "value"
    Works on both strings and arrays

  count() expression function: count(field) returns array/string length

  Array value type (KindArray): tags, required, enum fields now return arrays
    select(required contains "id"), select(tags contains "billing")

  Function composition works naturally:
    select(startswith(lower(name), "p"))
    select(split(path, "/", 1) == "users")

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add coverage for string ops, arithmetic, contains, and array values

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: respect pick field projection after group_by

  When pick sets explicit fields, group results now go through the standard field-based formatter instead of the hardcoded group formatter. This allows queries like:
    group_by(type) | pick key, names

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(oq): simplify query implementation — deduplicate parsing, traversals, and expressions

  - Unify splitPipeline/splitSemicolonArgs into generic splitAtDelim
  - Deduplicate identical isCall/!isCall branches in first/last/sample/neighbors
  - Remove redundant format isCall no-op assignment
  - Unify traverseRefsOut/traverseProperties/traverseItems via traverseOutEdges
  - Extract nodeIDsToRows from traverseReachable/traverseAncestors
  - Merge identical count()/len() expression functions
  - Remove dead StageWhere case from execStage (unreachable via execStageWithEnv)
  - Fix perfsprint lint: use e.Value directly instead of fmt.Sprintf

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(oq): add navigation stages for parameters, responses, content-types, headers

  Implements the navigation model from DESIGN.md:
  - New row types: ParameterResult, ResponseResult, RequestBodyResult, ContentTypeResult, HeaderResult
  - New stages: parameters, responses, request-body, content-types, headers, schema (singular), operation (back-nav)
  - Context propagation: child rows inherit status_code, operation name
  - SchemaByPtr on graph for bridging nav rows back to schema graph
  - Remove schemas.components and schemas.inline sources (use select())
  - Fix emit to use path instead of name for YAML key attribution

  Enables queries like:
    operations | responses | content-types | select(media_type == "text/event-stream") | operation | unique
    operations | parameters | select(in == "cookie") | pick name, in, operation

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update query help text, reference, and READMEs for navigation stages

  - Update query command help to show navigation stages and remove schemas.components/schemas.inline sources
  - Add parameter, response, request-body, content-type, and header fields documentation to query-reference
  - Add navigation examples to all READMEs
  - Update oq/README.md with navigation stages and new row type fields

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(oq): add components.* sources, security stages, and emit for nav rows

  - Add components.schemas, components.parameters, components.responses, components.request-bodies, components.headers, components.security-schemes as pipeline sources for querying reusable OpenAPI component definitions
  - Add security stage: operations | security yields SecurityRequirementResult rows with scheme_name, scheme_type, scopes, scope_count fields
  - Add SecuritySchemeResult row type for components.security-schemes source with name, type, in, scheme, bearer_format, description, has_flows fields
  - Fix emit for navigation rows (responses, parameters, etc.) — getRootNode now handles all row types instead of returning nil
  - Store Index on SchemaGraph for component/security access

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(oq): resolve $ref in schema stage, inherit global security, fix unique after pick

  Bug fixes:
  - B1: unique now deduplicates by projected field values when pick is active, falling back to row identity when no projection is set
  - B2: schema stage follows $ref edges via resolveRefTarget to return the actual component schema instead of the inline $ref wrapper node
  - B3: security stage inherits global security requirements when an operation has no per-operation security (nil vs empty array distinction)
  - D3: emit uses contextual compound keys for nav rows (e.g., "listEntities/200" for responses, "createPet/parameters/name" for params)

  Tests:
  - Extend petstore fixture with security schemes, deprecated parameters, SSE endpoint, headers, multiple content types, component params/responses
  - Add 20 new tests covering navigation stages, security inheritance, security opt-out, $ref resolution, unique-after-pick, content-type dedup, response headers, deprecated parameters, SSE queries

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(oq): rename ParamName to ComponentKey, fix component response names, add group_by name field

  - Rename Row.ParamName → Row.ComponentKey for consistent component key storage across all components.* sources
  - Fix components.responses: use ComponentKey instead of StatusCode for the component map key name (B4)
  - Fix RequestBodyResult.name: show ComponentKey for component request bodies instead of hardcoded "request-body"
  - Add group_by(field; name_field) syntax: optional second arg specifies which field to collect as group names (default: "name") (D2)
  - Add tests: component response name/status_code separation, group_by with name_field
  - Update AUDIT.md: mark B1-B4, D2, D3 as fixed

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: remove audit doc

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(oq): guard group_by semicolon args against nil slice access

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(oq): add depth-limited reachable(N) for bounded traversal

  reachable without args remains unlimited (full transitive closure). reachable(N) limits BFS to N hops from the seed schemas.

  Examples:
    schema | reachable(1) → direct children only
    schema | reachable(2) → children + their children (follows $refs)
    schema | reachable → everything (unchanged behavior)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update query help and reference for reachable(N), components.*, security, group_by name field

  - Add components.* sources, security stage, reachable(N) to help text
  - Add security scheme and security requirement field documentation
  - Document group_by(field; name_field) and unique-after-pick behavior
  - Update expression docs: contains, string functions, arithmetic, single quotes
  - Overhaul examples: organized by category (schema analysis, operations & navigation, security, content auditing, advanced) with representative queries covering the full feature set
  - Update CLI usage examples to showcase navigation and security queries

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
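Reviewer note: the commit message describes `reachable(N)` as a BFS limited to N hops from the seed schemas. The following is a minimal sketch of that idea, not the oq implementation. The `Graph` type, the `reachable` signature, excluding the seed from the result, and treating a negative depth as "unlimited" are all assumptions made for illustration.

```go
package main

import "fmt"

// Graph is a hypothetical adjacency list: schema name -> schemas it references.
type Graph map[string][]string

// reachable returns the nodes reachable from seed within maxDepth hops, in BFS
// order, excluding the seed itself. A negative maxDepth means unlimited.
func reachable(g Graph, seed string, maxDepth int) []string {
	type item struct {
		node  string
		depth int
	}
	visited := map[string]bool{seed: true}
	queue := []item{{seed, 0}}
	var out []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if maxDepth >= 0 && cur.depth == maxDepth {
			continue // hop budget used up: do not expand this node
		}
		for _, next := range g[cur.node] {
			if visited[next] {
				continue // already seen; this also terminates cycles
			}
			visited[next] = true
			out = append(out, next)
			queue = append(queue, item{next, cur.depth + 1})
		}
	}
	return out
}

func main() {
	g := Graph{
		"Pet":      {"Category", "Tag"},
		"Category": {"Label"},
		"Label":    {"Pet"}, // cycle back to the seed
	}
	fmt.Println(reachable(g, "Pet", 1))  // [Category Tag]
	fmt.Println(reachable(g, "Pet", 2))  // [Category Tag Label]
	fmt.Println(reachable(g, "Pet", -1)) // [Category Tag Label]
}
```

Tracking depth on the queue entry rather than in a separate counter keeps the unlimited and bounded cases on one code path, which matches the commit's claim that `reachable` without arguments is unchanged.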
1 parent 5083f6f commit a7b9e2f

19 files changed

Lines changed: 3534 additions & 664 deletions

AGENTS.md

Lines changed: 9 additions & 0 deletions
@@ -138,6 +138,15 @@ If you add new packages to the root module (e.g., `oq/`, `graph/`) that `cmd/ope
 
 This gives `cmd/openapi/go.mod` a pseudo-version (e.g., `v1.19.6-0.20260312183335-395c19cd8edd`) that resolves correctly both locally and in CI. Each subsequent push that changes the root module requires repeating step 2 with the new commit SHA.
 
+## CLI Documentation
+
+When adding or modifying subcommands under `openapi spec`, you **must** update both:
+
+1. `README.md` (root) — the command list and Quick Examples section
+2. `cmd/openapi/commands/openapi/README.md` — detailed command documentation with examples, flags, and usage patterns
+
+The command README (`cmd/openapi/commands/openapi/README.md`) serves as the primary reference for each subcommand and should include usage examples, flag tables, and before/after demonstrations where applicable.
+
 ## Linter Rules
 
 This project uses `golangci-lint` with strict rules. Run `mise lint` to check. The most common violations are listed below. **When you encounter a new common lint pattern not documented here, add it to this section so future sessions avoid the same mistakes.**

README.md

Lines changed: 8 additions & 0 deletions
@@ -134,6 +134,8 @@ The CLI provides four main command groups:
 - `snip` - Remove selected operations from an OpenAPI specification (interactive or CLI)
 - `upgrade` - Upgrade an OpenAPI specification to the latest supported version
 - `validate` - Validate an OpenAPI specification document
+- `query` - Query an OpenAPI specification using the [oq pipeline language](./oq/README.md) to answer structural and semantic questions about schemas and operations
+- `query-reference` - Print the complete oq query language reference
 
 - **`openapi swagger`** - Commands for working with Swagger 2.0 documents ([documentation](./cmd/openapi/commands/swagger/README.md))
 - `validate` - Validate a Swagger 2.0 specification document
@@ -179,6 +181,12 @@ openapi swagger validate ./api.swagger.yaml
 
 # Upgrade Swagger 2.0 to OpenAPI 3.0
 openapi swagger upgrade ./api.swagger.yaml ./openapi.yaml
+
+# Query schema graph — find deeply nested components
+openapi spec query 'schemas | select(is_component) | sort_by(depth; desc) | first(10) | pick name, depth' ./spec.yaml
+
+# Query schema graph — blast radius of a schema change
+openapi spec query 'schemas | select(name == "Error") | blast-radius | length' ./spec.yaml
 ```
 
 For detailed usage instructions for each command group, see the individual documentation linked above.

cmd/openapi/commands/openapi/README.md

Lines changed: 58 additions & 0 deletions
@@ -24,6 +24,8 @@ OpenAPI specifications define REST APIs in a standard format. These commands hel
 - [`localize`](#localize)
 - [`explore`](#explore)
 - [`snip`](#snip)
+- [`query`](#query)
+- [`query-reference`](#query-reference)
 - [Common Options](#common-options)
 - [Output Formats](#output-formats)
 - [Examples](#examples)
@@ -1196,6 +1198,62 @@ components:
 - You want to reduce the size and complexity of a specification
 - You need to create different API variants from a single source
 
+### `query`
+
+Query an OpenAPI specification using the oq pipeline language to answer structural and semantic questions about schemas and operations.
+
+```bash
+# Find deeply nested components
+openapi spec query 'schemas | select(is_component) | sort_by(depth; desc) | first(10) | pick name, depth' ./spec.yaml
+
+# OneOf unions missing discriminator
+openapi spec query 'schemas | select(is_component and union_width > 0 and not has_discriminator) | pick name, union_width' ./spec.yaml
+
+# Schemas missing descriptions
+openapi spec query 'schemas | select(is_component and not has_description) | pick name, type' ./spec.yaml
+
+# Operations missing error responses
+openapi spec query 'operations | select(not has_error_response) | pick name, method, path' ./spec.yaml
+
+# Blast radius — what breaks if I change a schema?
+openapi spec query 'schemas | select(name == "Error") | blast-radius | length' ./spec.yaml
+
+# Duplicate inline schemas
+openapi spec query 'schemas | select(not is_component) | group_by(hash) | select(count > 1)' ./spec.yaml
+
+# Navigate to a single schema by name
+openapi spec query 'schemas | select(name == "Pet") | explain' ./spec.yaml
+
+# List all enum schemas
+openapi spec query 'schemas | select(is_component and enum_count > 0) | pick name, enum_count' ./spec.yaml
+
+# Pipe from stdin
+cat spec.yaml | openapi spec query 'schemas | length'
+
+# Read query from file
+openapi spec query -f analysis.oq ./spec.yaml
+
+# Output as JSON
+openapi spec query --format json 'schemas | select(is_component) | first(5)' ./spec.yaml
+```
+
+**Flags:**
+
+| Flag | Short | Description |
+|------|-------|-------------|
+| `--format` | | Output format: `table` (default), `json`, `markdown`, or `toon` |
+| `--file` | `-f` | Read query from file instead of argument |
+
+For the full query language reference, run `openapi spec query-reference`.
+
+### `query-reference`
+
+Print the complete reference for the oq pipeline query language, including all sources, stages, fields, operators, and examples.
+
+```bash
+openapi spec query-reference
+```
+
 ## Common Options
 
 All commands support these common options:
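Reviewer note: the "Duplicate inline schemas" example above combines `group_by(hash)` with `select(count > 1)`. A rough sketch of what that pair of stages computes is shown here in Go. The `Row` and `Group` types and the `duplicateGroups` function are hypothetical stand-ins, and sorting groups by key is an assumption added only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical stand-in for one schema row in the pipeline.
type Row struct {
	Name string
	Hash string
}

// Group is a hypothetical group_by result: a key, a member count, and names.
type Group struct {
	Key   string
	Count int
	Names []string
}

// duplicateGroups buckets rows by hash and keeps only buckets that hold more
// than one row, mirroring group_by(hash) | select(count > 1).
func duplicateGroups(rows []Row) []Group {
	byHash := map[string][]string{}
	for _, r := range rows {
		byHash[r.Hash] = append(byHash[r.Hash], r.Name)
	}
	var groups []Group
	for hash, names := range byHash {
		if len(names) > 1 {
			groups = append(groups, Group{Key: hash, Count: len(names), Names: names})
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Slice(groups, func(i, j int) bool { return groups[i].Key < groups[j].Key })
	return groups
}

func main() {
	rows := []Row{
		{Name: "inline-1", Hash: "abc"},
		{Name: "inline-2", Hash: "def"},
		{Name: "inline-3", Hash: "abc"},
	}
	for _, g := range duplicateGroups(rows) {
		fmt.Printf("%s count=%d names=%v\n", g.Key, g.Count, g.Names)
	}
}
```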

cmd/openapi/commands/openapi/query.go

Lines changed: 59 additions & 54 deletions
@@ -17,60 +17,39 @@ var queryCmd = &cobra.Command{
 	Use: "query <query> [input-file]",
 	Short: "Query an OpenAPI specification using the oq pipeline language",
 	Long: `Query an OpenAPI specification using the oq pipeline language to answer
-structural and semantic questions about schemas and operations.
-
-The query argument comes first, followed by an optional input file. If no file
-is given, reads from stdin.
-
-Examples:
-# Deeply nested components (jq-style syntax)
-openapi spec query 'schemas.components | sort_by(depth; desc) | first(10) | pick name, depth' petstore.yaml
-
-# Pipe from stdin
-cat spec.yaml | openapi spec query 'schemas | count'
-
-# Explicit stdin
-openapi spec query 'schemas | count' -
-
-# Filter with select()
-openapi spec query 'schemas | select(union_width > 0) | sort_by(union_width; desc) | first(10)' petstore.yaml
-
-# Dead components (no incoming references)
-openapi spec query 'schemas.components | select(in_degree == 0) | pick name' petstore.yaml
-
-# Variable binding — exclude seed from reachable results
-openapi spec query 'schemas | select(name == "Pet") | let $pet = name | reachable | select(name != $pet)' petstore.yaml
-
-# User-defined functions
-openapi spec query 'def hot: select(in_degree > 5); schemas.components | hot | pick name' petstore.yaml
-
-# Alternative operator — fallback for null/falsy values
-openapi spec query 'schemas | select(name // "none" != "none")' petstore.yaml
-
-# If-then-else conditional
-openapi spec query 'schemas | select(if is_component then depth > 3 else true end)' petstore.yaml
-
-# Blast radius
-openapi spec query 'schemas.components | select(name == "Error") | blast-radius | length' petstore.yaml
-
-# Explain a query plan
-openapi spec query 'schemas.components | select(depth > 5) | sort_by(depth; desc) | explain' petstore.yaml
-
-Pipeline stages (jq-style):
-Source: schemas, schemas.components, schemas.inline, operations
-Traversal: refs-out, refs-in, reachable, ancestors, properties, union-members, items,
-ops, schemas, path(A; B), connected, blast-radius, neighbors(N)
-Analysis: orphans, leaves, cycles, clusters, tag-boundary, shared-refs
-Filter: select(expr), pick <fields>, sort_by(field; desc), first(N), last(N),
-sample(N), top(N; field), bottom(N; field), unique, group_by(field), length
-Variables: let $var = expr
-Functions: def name: body; def name($p): body; include "file.oq";
-Meta: explain, fields, format(table|json|markdown|toon)
-
-Legacy syntax (where, sort, take, head, select fields, group-by, count) is still supported.
-
-Expression operators: ==, !=, >, <, >=, <=, and, or, not, //, has(), matches,
-if-then-else-end, string interpolation \(expr)`,
+structural and semantic questions about schemas, operations, parameters,
+responses, content types, and headers.`,
+	Example: `Queries are pipelines: source | stage | stage | ...
+
+Pipeline stages:
+Source: schemas, operations, components.schemas, components.parameters,
+components.responses, components.request-bodies, components.headers,
+components.security-schemes
+Navigation: parameters, responses, request-body, content-types, headers,
+schema, operation, security
+Traversal: refs-out, refs-in, reachable, reachable(N), ancestors, properties,
+union-members, items, parent, ops, schemas, path(A; B), connected,
+blast-radius, neighbors(N)
+Analysis: orphans, leaves, cycles, clusters, tag-boundary, shared-refs
+Filter: select(expr), pick <fields>, sort_by(field; desc), first(N), last(N),
+sample(N), top(N; field), bottom(N; field), unique,
+group_by(field), group_by(field; name_field), length
+Variables: let $var = expr
+Functions: def name: body; def name($p): body; include "file.oq";
+Output: emit, format(table|json|markdown|toon)
+Meta: explain, fields
+
+Operators: ==, !=, >, <, >=, <=, and, or, not, //, has(), matches, contains,
+if-then-else-end, \(interpolation), lower(), upper(), len(), split()
+
+openapi spec query 'operations | responses | content-types | select(media_type == "text/event-stream") | operation | unique' spec.yaml
+openapi spec query 'operations | security | group_by(scheme_type; operation)' spec.yaml
+openapi spec query 'schemas | select(is_component) | sort_by(depth; desc) | first(10) | pick name, depth' spec.yaml
+openapi spec query 'operations | select(name == "createUser") | request-body | content-types | schema | reachable(2) | emit' spec.yaml
+openapi spec query 'components.security-schemes | pick name, type, scheme' spec.yaml
+cat spec.yaml | openapi spec query 'schemas | length'
+
+For the full query language reference, run: openapi spec query-reference`,
 	Args: queryArgs(),
 	Run: runQuery,
 }
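Reviewer note: the help text states that queries are pipelines of the form `source | stage | stage`, and the commit message mentions a generic `splitAtDelim` that respects quotes and backslash escapes. The sketch below shows one way such a splitter could behave. It borrows only the name `splitAtDelim` from the commit message; the signature, the parenthesis tracking, and the whitespace trimming are assumptions, not the actual oq code.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAtDelim splits s at every occurrence of delim that is outside quotes
// and outside parentheses. Both single and double quotes are tracked, and a
// backslash inside a quoted string escapes the next character.
func splitAtDelim(s string, delim byte) []string {
	var parts []string
	var quote byte // 0 while outside a quoted string
	depth := 0     // current parenthesis nesting
	start := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case quote != 0:
			if c == '\\' && i+1 < len(s) {
				i++ // skip the escaped character
			} else if c == quote {
				quote = 0 // closing quote
			}
		case c == '"' || c == '\'':
			quote = c
		case c == '(':
			depth++
		case c == ')':
			if depth > 0 {
				depth--
			}
		case c == delim && depth == 0:
			parts = append(parts, strings.TrimSpace(s[start:i]))
			start = i + 1
		}
	}
	return append(parts, strings.TrimSpace(s[start:]))
}

func main() {
	q := `schemas | select(name == 'a|b') | pick name`
	for _, stage := range splitAtDelim(q, '|') {
		fmt.Printf("%q\n", stage)
	}
}
```

Using one function for both `|` and `;` keeps the quote and escape rules identical across pipeline splitting and argument splitting, which appears to be the motivation for the unification described in the commit message.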
@@ -81,6 +60,25 @@ var queryFromFile string
 func init() {
 	queryCmd.Flags().StringVar(&queryOutputFormat, "format", "table", "output format: table, json, markdown, or toon")
 	queryCmd.Flags().StringVarP(&queryFromFile, "file", "f", "", "read query from file instead of argument")
+
+	// Custom help template: Usage + Flags together, then Examples last
+	queryCmd.SetUsageTemplate(`Usage:{{if .Runnable}}
+{{.UseLine}}{{end}}{{if .HasAvailableSubCommands}}
+{{.CommandPath}} [command]{{end}}{{if gt (len .Aliases) 0}}
+
+Aliases:
+{{.NameAndAliases}}{{end}}{{if .HasAvailableLocalFlags}}
+
+Flags:
+{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}{{if .HasAvailableInheritedFlags}}
+
+Global Flags:
+{{.InheritedFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}{{if .HasExample}}
+
+{{.Example}}{{end}}{{if .HasAvailableSubCommands}}
+
+Use "{{.CommandPath}} [command] --help" for more information about a command.{{end}}
+`)
 }
 
 func runQuery(cmd *cobra.Command, args []string) {
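Reviewer note: Cobra usage templates are Go `text/template` strings, so the `{{if .HasExample}}` guard in the hunk above can be tried in isolation with the standard library. The sketch below is a cut-down illustration of that guard only. `helpData`, `usageTmpl`, and `render` are hypothetical names, and the template is far smaller than the one in the diff.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// helpData is a hypothetical stand-in for the command object that the
// template is executed against.
type helpData struct {
	UseLine string
	Example string
}

// HasExample mirrors the guard used in the diff: the example section is only
// rendered when there is example text.
func (h helpData) HasExample() bool { return h.Example != "" }

// usageTmpl prints the usage line first and the example block last.
const usageTmpl = `Usage:
  {{.UseLine}}{{if .HasExample}}

{{.Example}}{{end}}
`

// render executes the template and returns the resulting help text.
func render(h helpData) (string, error) {
	t, err := template.New("usage").Parse(usageTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, h); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(helpData{
		UseLine: "openapi spec query <query> [input-file]",
		Example: "Queries are pipelines: source | stage | stage | ...",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```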
@@ -146,6 +144,13 @@ func queryOpenAPI(ctx context.Context, processor *OpenAPIProcessor, queryStr str
 		return fmt.Errorf("query error: %w", err)
 	}
 
+	// Emit stage outputs raw YAML nodes, bypassing format selection
+	if result.EmitYAML {
+		output := oq.FormatYAML(result, g)
+		fmt.Fprint(processor.stdout(), output)
+		return nil
+	}
+
 	// Format and output — inline format stage overrides CLI flag
 	format := queryOutputFormat
 	if result.FormatHint != "" {
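Reviewer note: the hunk above sets up a three-level precedence for output: an `emit` stage wins outright, then an inline `format(...)` stage, then the `--format` flag. The sketch below restates that precedence as a pure function. The `Result` struct here is a reduced stand-in carrying only the two fields visible in the diff, and `outputMode` along with the `"yaml"` return value are hypothetical names used for illustration.

```go
package main

import "fmt"

// Result is a reduced stand-in with only the two fields the diff reads.
type Result struct {
	EmitYAML   bool   // set when the pipeline contains an emit stage
	FormatHint string // set by an inline format(...) stage, empty otherwise
}

// outputMode returns which formatter should run for a result, given the
// value of the --format CLI flag.
func outputMode(r Result, cliFormat string) string {
	if r.EmitYAML {
		return "yaml" // emit bypasses format selection entirely
	}
	if r.FormatHint != "" {
		return r.FormatHint // inline format stage overrides the CLI flag
	}
	return cliFormat
}

func main() {
	fmt.Println(outputMode(Result{EmitYAML: true, FormatHint: "json"}, "table")) // yaml
	fmt.Println(outputMode(Result{FormatHint: "json"}, "table"))                 // json
	fmt.Println(outputMode(Result{}, "table"))                                   // table
}
```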