Change CLI output modes so that:
- `--output text` stays as the current human-readable output
- `--output jsonl` becomes the current event-stream output (today exposed as `json`)
- `--output json` becomes a new structured JSON document representing the tool's primary result
- `--output raw` stays unchanged
This plan is based on the current implementation, not the stale docs/dev/RENDERING_PIPELINE_REFACTOR.md redesign.
- Rename the current `json` output mode to `jsonl`
- Add a new structured `json` mode
- Preserve current text output behavior without regressions
- Preserve current event-stream behavior without regressions
- Exclude auxiliary output from structured JSON:
- no next steps
- no frontmatter
- no event stream passthrough
- Standardize structured JSON around a common envelope
- Codify per-tool outputs as versioned schemas
- Add snapshot coverage for structured JSON output
- Prefer the cleanest, lowest-churn design over a broad pipeline rewrite
The current flow is already good enough to support this change incrementally:
- tools emit `PipelineEvent`s through `ToolHandlerContext.emit`
- CLI creates a `RenderSession`
- the session accumulates:
  - events
  - attachments
  - error state
- `postProcessSession()` appends `next-steps` as another event
- CLI decides how output is rendered/printed based on `--output`
Today:
- `--output text` uses `cli-text`
- `--output raw` uses `text`
- `--output json` uses `cli-json`, which actually prints one JSON event per line

So current `json` is semantically JSONL already.
Implement structured JSON as a CLI-boundary projection over the final RenderSession.
That means:
- run the tool normally
- let it emit its normal `PipelineEvent`s
- let daemon-backed tools replay into the same CLI-side session as they do today
- let `postProcessSession()` run as it does today
- for `--output json`, build one structured JSON object from the final session state
This avoids the wrong kind of complexity.
We should not:
- rewrite tool handlers to return new result types
- refactor MCP output
- change daemon protocol
- add a separate deep rendering pipeline for structured JSON
- parse rendered text back into JSON
We should:
- keep text and jsonl on the existing event/session path
- add structured JSON only at the final CLI output boundary
This keeps the design simple:
- one event production path
- one session capture model
- two presentation styles from the same source:
  - streamed event output (`jsonl`)
  - final projected output (`json`)
All structured JSON responses should use one envelope shape:
{
"schema": "xcodebuildmcp.output.simulator-list",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {}
}

- `schema`
  - stable identifier for the payload contract
  - example: `xcodebuildmcp.output.simulator-list`
- `schemaVersion`
  - manual contract version
  - start at `"1"`
  - bump only for breaking changes
- `didError`
  - whether the command failed
- `error`
  - standardized human-readable failure string
  - `null` on success
- `data`
  - the primary structured result
  - `null` when no meaningful primary data exists
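
As a rough illustration, the envelope fields above could be typed as follows. Only the five field names come from this plan; the names `StructuredEnvelope`, `successEnvelope`, and `errorEnvelope` are hypothetical and not taken from the codebase.

```typescript
// Hypothetical typing of the envelope; only the field names come from the plan.
interface StructuredEnvelope<TData> {
  schema: string;
  schemaVersion: string;
  didError: boolean;
  error: string | null;
  data: TData | null;
}

// Builds a success envelope: `error` is always null.
function successEnvelope<TData>(
  schema: string,
  schemaVersion: string,
  data: TData | null,
): StructuredEnvelope<TData> {
  return { schema, schemaVersion, didError: false, error: null, data };
}

// Builds a failure envelope; `data` defaults to null but may carry durable failure details.
function errorEnvelope<TData>(
  schema: string,
  schemaVersion: string,
  error: string,
  data: TData | null = null,
): StructuredEnvelope<TData> {
  return { schema, schemaVersion, didError: true, error, data };
}
```

Keeping construction in two small helpers like these is one way to make it hard to emit a success envelope with a non-null `error`, or a failure envelope without one.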
- emit exactly one JSON document
- pretty-print with 2-space indentation
- include a trailing newline
- never include next steps
- never include `_meta`
- never include raw `PipelineEvent[]`
- never include rendered human text blocks just to mirror the text mode
Structured JSON should always remain parseable.
If the tool fails:
{
"schema": "xcodebuildmcp.output.app-path",
"schemaVersion": "1",
"didError": true,
"error": "App path lookup failed",
"data": null
}

If structured-output generation itself fails internally, still emit a valid envelope with:
- `didError: true`
- `error: "Structured output generation failed: ..."`
- `data: null`
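
One possible way to guarantee this fallback is to wrap the builder call in a `try`/`catch`. The sketch below assumes a hypothetical `buildEnvelopeSafely` helper and a builder passed in as a callback; neither name comes from the codebase.

```typescript
interface Envelope {
  schema: string;
  schemaVersion: string;
  didError: boolean;
  error: string | null;
  data: unknown;
}

// Wraps a (hypothetical) envelope builder so internal failures still yield a valid envelope.
function buildEnvelopeSafely(
  schema: string,
  schemaVersion: string,
  build: () => Envelope,
): Envelope {
  try {
    return build();
  } catch (cause) {
    // Non-Error throwables are stringified so the message is always a string.
    const reason = cause instanceof Error ? cause.message : String(cause);
    return {
      schema,
      schemaVersion,
      didError: true,
      error: `Structured output generation failed: ${reason}`,
      data: null,
    };
  }
}
```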
Define structured output schemas in code, not in manifest YAML.
Recommended module layout:
- `src/structured-output/types.ts`
- `src/structured-output/helpers.ts`
- `src/structured-output/registry.ts`
- `src/structured-output/index.ts`
- `src/structured-output/definitions/common.ts`
- workflow-specific definition files:
  - `coverage.ts`
  - `debugging.ts`
  - `device.ts`
  - `doctor.ts`
  - `macos.ts`
  - `project-discovery.ts`
  - `project-scaffolding.ts`
  - `session-management.ts`
  - `simulator.ts`
  - `simulator-management.ts`
  - `swift-package.ts`
  - `ui-automation.ts`
  - `utilities.ts`
  - `workflow-discovery.ts`
  - `xcode-ide.ts`
Each schema definition should include:
- schema id
- schema version
- associated tool ids
- a Zod schema for the `data` payload
- a builder that projects data from the final event session
Conceptually:
interface StructuredOutputDefinition<TData> {
schema: string;
schemaVersion: string;
toolKeys: readonly string[];
dataSchema: z.ZodType<TData>;
build(input: StructuredOutputBuildInput, view: StructuredEventView): TData | null;
}

Create one indexed event view from the final session events, excluding `next-steps`.
That view should expose grouped access to:
- header
- status lines
- summaries
- detail trees
- tables
- sections
- file refs
- compiler errors
- compiler warnings
- test discovery
- test failures
This avoids scattered ad hoc event scanning in every tool definition.
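
A minimal sketch of such a view, assuming only that every event carries a string `type` field. The type strings `header` and `table` in the sample data are placeholders; `next-steps` is the only event type this plan names.

```typescript
// Minimal stand-in for the real event type; only `type` is assumed here.
interface EventLike {
  type: string;
  [key: string]: unknown;
}

// Groups events by `type`, preserving emission order and dropping `next-steps`.
function indexEvents(events: readonly EventLike[]): Map<string, EventLike[]> {
  const grouped = new Map<string, EventLike[]>();
  for (const item of events) {
    if (item.type === "next-steps") continue; // auxiliary output never reaches structured JSON
    const group = grouped.get(item.type) ?? [];
    group.push(item);
    grouped.set(item.type, group);
  }
  return grouped;
}

const eventView = indexEvents([
  { type: "header", title: "List Simulators" },
  { type: "table", rows: 2 },
  { type: "next-steps", steps: ["boot_sim"] },
  { type: "table", rows: 1 },
]);
```

Builders would then read from `eventView` by group rather than scanning the raw event array themselves.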
Not every tool should get its own bespoke data shape.
A core goal of this design is that consumers should be able to write a small number of parsers for shared output families, not hundreds of parsers for equivalent data exposed by different tools.
That means:
- prefer shared schema families over per-tool schemas
- prefer shared field names across related tool results
- use tool-specific schemas only when the primary data is genuinely different
- compose from common sub-schemas rather than inventing new top-level structures casually
Examples:
- `xcodebuildmcp.output.app-path`
  - `get_sim_app_path`
  - `get_device_app_path`
  - `get_mac_app_path`
- `xcodebuildmcp.output.bundle-id`
  - `get_app_bundle_id`
  - `get_mac_bundle_id`
- `xcodebuildmcp.output.simulator-list`
  - simulator list commands with the same payload shape
- `xcodebuildmcp.output.launch-result`
  - `launch_app_sim`
  - `launch_app_device`
  - `launch_mac_app`
- `xcodebuildmcp.output.build-result`
  - `build_sim`
  - `build_device`
  - `build_macos`
  - `swift_package_build`
Use shared contracts where it reduces duplication cleanly.
The doc should explicitly bias toward reusable base shapes.
Recommended common sub-schemas:
- `summary`
  - status
  - duration
  - optional test counts
- `diagnostics`
  - warnings
  - errors
  - testFailures when relevant
- `artifacts`
  - appPath
  - bundleId
  - processId
  - simulatorId
  - deviceId
  - buildLogPath
  - runtimeLogPath
  - osLogPath
- `entries`
  - ordered key/value outputs
- `items`
  - normalized list outputs
The important rule is consistency:
- if `appPath` is part of a result, it should live in the same place for every schema family that uses it
- if `processId` is part of a result, it should live in the same place for every schema family that uses it
- if `buildLogPath` is emitted, it should not move around between sibling schemas
For durable outputs that are artifacts of the command result, prefer nesting them under artifacts rather than placing them at the top level opportunistically.
That means the plan should normalize toward shapes like:
{
"data": {
"summary": {
"status": "SUCCEEDED",
"durationMs": 5234
},
"artifacts": {
"appPath": "...",
"bundleId": "com.example.CalculatorApp",
"processId": 12345,
"simulatorId": "...",
"buildLogPath": "..."
},
"diagnostics": {
"warnings": [],
"errors": []
}
}
}

For simple lookup-only results, a reduced version of the same idea still applies:
{
"data": {
"artifacts": {
"appPath": "..."
}
}
}

This is preferable to having one tool emit `data.appPath` and another emit `data.artifacts.appPath` unless there is a very strong reason.
The implementation should begin with an audit that groups tools into shared result families before any code is written.
A recommended initial family matrix is:
| Schema family | Shared data shape intent | Example tools |
|---|---|---|
| `app-path` | `artifacts.appPath` | `get_sim_app_path`, `get_device_app_path`, `get_mac_app_path` |
| `bundle-id` | `artifacts.bundleId` | `get_app_bundle_id`, `get_mac_bundle_id` |
| `launch-result` | `artifacts.bundleId`, `artifacts.processId`, target id | `launch_app_sim`, `launch_app_device`, `launch_mac_app` |
| `install-result` | target id + installed artifact identity where available | `install_app_sim`, `install_app_device` |
| `stop-result` | target id + stopped process/app identity where available | `stop_app_sim`, `stop_app_device`, `stop_mac_app`, `swift_package_stop` |
| `build-result` | `summary`, `artifacts`, `diagnostics` | `build_sim`, `build_device`, `build_macos`, `swift_package_build`, `clean` |
| `test-result` | `summary`, `diagnostics`, optional discoveries/artifacts | `test_sim`, `test_device`, `test_macos`, `swift_package_test` |
| `build-run-result` | `summary`, `artifacts`, `diagnostics` | `build_run_sim`, `build_run_device`, `build_run_macos`, possibly `swift_package_run` if it fits cleanly |
| `simulator-list` | `simulators[]` normalized records | `list_sims` |
| `device-list` | `devices[]` normalized records | `list_devices` |
| `scheme-list` | `schemes[]` | `list_schemes` |
| `project-list` | `projects[]` | `discover_projects` |
| `settings-entries` | ordered `entries[]` | `show_build_settings`, session/defaults-style key-value outputs where appropriate |
| `coverage-result` | summary + coverage entries | `get_coverage_report`, `get_file_coverage` |
| `ui-action-result` | action target + artifact refs if produced | `tap`, `swipe`, `touch`, `long_press`, `button`, `gesture`, `type_text`, `key_press`, `key_sequence` |
| `capture-result` | artifact paths and capture metadata | `screenshot`, `snapshot_ui`, `record_sim_video` |
| `normalized-content` | generic fallback for proxy/dynamic tools | dynamic xcode-ide style tools |
This matrix should be refined from a real tool audit, but the principle should not change: group by result semantics, not by command name.
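
To make the family grouping concrete, here is a sketch of a registry that maps tool ids to shared families and rejects a tool that is claimed by two families. `FamilyDefinition` and `buildRegistry` are hypothetical names; `toolKeys` mirrors the definition contract sketched earlier in this plan.

```typescript
// Hypothetical registry entry; a real definition would also carry the data schema and builder.
interface FamilyDefinition {
  schema: string;
  schemaVersion: string;
  toolKeys: readonly string[];
}

// Builds a tool-id -> definition lookup and rejects a tool mapped to two families.
function buildRegistry(definitions: readonly FamilyDefinition[]): Map<string, FamilyDefinition> {
  const byTool = new Map<string, FamilyDefinition>();
  for (const definition of definitions) {
    for (const tool of definition.toolKeys) {
      const existing = byTool.get(tool);
      if (existing) {
        throw new Error(`Tool "${tool}" is mapped to both ${existing.schema} and ${definition.schema}`);
      }
      byTool.set(tool, definition);
    }
  }
  return byTool;
}

const familyRegistry = buildRegistry([
  {
    schema: "xcodebuildmcp.output.app-path",
    schemaVersion: "1",
    toolKeys: ["get_sim_app_path", "get_device_app_path", "get_mac_app_path"],
  },
  {
    schema: "xcodebuildmcp.output.bundle-id",
    schemaVersion: "1",
    toolKeys: ["get_app_bundle_id", "get_mac_bundle_id"],
  },
]);
```

Three app-path tools resolve to one definition here, which is the point of grouping by result semantics: a consumer writes one parser per family.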
Every new schema starts at version "1".
Examples of breaking changes that require a version bump:
- renaming a field
- removing a field
- changing a field type
- changing the structure of arrays/objects incompatibly
- changing semantic meaning of an existing field
Examples of non-breaking changes that do not require a bump:
- internal extraction refactors
- text formatting changes
- adding truly optional fields
Each data payload should be validated against a strict Zod schema before emission.
That gives us:
- fail-fast schema drift detection
- a clear version boundary
- confidence that fixtures match actual contracts
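
The plan calls for Zod here. To keep the following sketch dependency-free, it uses a small hand-written check that imitates strict-object validation (required fields typed, unknown keys rejected). It is a stand-in for illustration only, and `validateStrict` and `launchShape` are hypothetical names.

```typescript
// Dependency-free stand-in for a strict schema: typed required fields, no unknown keys.
type FieldKind = "string" | "number";

function validateStrict(value: unknown, shape: Record<string, FieldKind>): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["expected an object"];
  }
  const problems: string[] = [];
  const candidate = value as Record<string, unknown>;
  // Every declared field must be present with the declared primitive type.
  for (const key of Object.keys(shape)) {
    if (typeof candidate[key] !== shape[key]) problems.push(`${key}: expected ${shape[key]}`);
  }
  // Strictness: any field not declared in the shape counts as drift.
  for (const key of Object.keys(candidate)) {
    if (!(key in shape)) problems.push(`${key}: unexpected field`);
  }
  return problems;
}

const launchShape: Record<string, FieldKind> = { bundleId: "string", processId: "number" };
const drift = validateStrict({ bundleId: "com.example.CalculatorApp", pid: 12345 }, launchShape);
// `drift` reports the missing `processId` and the unexpected `pid`.
```

A renamed field shows up as two problems at once, which is the fail-fast drift detection the list above describes.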
Derive structured JSON in the CLI command handler after tool invocation completes.
Practically, this means wiring it in src/cli/register-tool-commands.ts after await invoker.invokeDirect(...) returns.
At that point the CLI already has:
- `session.getEvents()`
- `session.getAttachments()`
- `session.isError()`
That is the cleanest seam because it works equally for:
- direct CLI tools
- daemon-backed CLI tools
without changing daemon or MCP contracts.
Structured JSON builders must ignore:
- `next-steps`
- any future auxiliary presentation-only event types
The structured payload should represent only the tool's primary result.
Prefer data extraction in this order:
- typed event fields directly
- detail tree items
- table rows
- file refs
- summaries
- test/diagnostic events
- command args where the result is basically the performed action
- tool-specific parsing of `SectionEvent.title` and `SectionEvent.lines` only when needed
- never parse rendered text output
This is an important constraint: structured JSON should be projected from the event model, not reverse-engineered from final text.
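
The priority order above can be expressed as an ordered list of extractors where the first one that yields a value wins. In this sketch, `Extractor` and `extractFirst` are hypothetical names and the paths are made-up sample data.

```typescript
// Each extractor reads one kind of source and returns undefined when it has nothing to offer.
type Extractor<T> = { source: string; read: () => T | undefined };

// Tries extractors in the given priority order and reports which source won.
function extractFirst<T>(
  extractors: readonly Extractor<T>[],
): { source: string; value: T } | undefined {
  for (const extractor of extractors) {
    const value = extractor.read();
    if (value !== undefined) return { source: extractor.source, value };
  }
  return undefined;
}

const appPathResult = extractFirst<string>([
  { source: "typed event field", read: () => undefined },
  { source: "detail tree item", read: () => "/tmp/CalculatorApp.app" },
  { source: "section lines", read: () => "/tmp/from-section.app" },
]);
```

Because the section-based extractor sits last, it only runs when the more structured sources have nothing, which keeps section parsing a last resort rather than a default.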
Change CLI output choices from:
- `text`
- `json`
- `raw`
to:
- `text`
- `json`
- `jsonl`
- `raw`

Each mode then means:
- `text`: current human-readable CLI output
- `jsonl`: current event stream output, one JSON event per line
- `json`: new structured result envelope
- `raw`: unchanged
Rename internal render strategy naming to match the new CLI terminology:
- `cli-json` -> `cli-jsonl`
Behavior remains the same.
For json mode:
- use a silent capture session rather than the streaming JSONL session
- do not print streamed output
- after execution, build the structured JSON envelope from the final session
- print the envelope once
- set exit code from the structured result/session error state
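
The `json` mode steps above might look roughly like this once the tool has finished. `SessionLike` is a hypothetical slice of the session API (the plan names `getEvents()` and `isError()`), and the exit code of 1 and the `"Command failed"` message are placeholders rather than documented behavior.

```typescript
// Hypothetical slice of the session API used by this sketch.
interface SessionLike {
  getEvents(): readonly { type: string }[];
  isError(): boolean;
}

interface JsonModeResult {
  stdout: string; // the single envelope, printed once
  exitCode: number; // derived from the session error state
}

// Projects a finished session into one printed envelope plus an exit code.
function finishJsonMode(
  session: SessionLike,
  schema: string,
  buildData: (events: readonly { type: string }[]) => unknown,
): JsonModeResult {
  const primaryEvents = session.getEvents().filter((e) => e.type !== "next-steps");
  const didError = session.isError();
  const envelope = {
    schema,
    schemaVersion: "1",
    didError,
    error: didError ? "Command failed" : null, // placeholder message
    data: buildData(primaryEvents),
  };
  return { stdout: JSON.stringify(envelope, null, 2) + "\n", exitCode: didError ? 1 : 0 };
}
```

Returning the text and exit code instead of printing inside the function keeps the projection testable without capturing process output.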
Current CLI validation has early console.error(...) branches for things like:
- invalid `--json`
- unknown defaults profile
- missing required arguments
- unexpected args
These must be updated so that when --output json is selected they emit a structured error envelope instead of plain text.
This is required to keep machine output clean and parseable.
For other output modes, existing behavior should remain unchanged.
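
One way to centralize that branching is a single helper that every early-validation path calls. `formatEarlyError` is a hypothetical name, and routing the envelope to stdout is an assumption made for this sketch.

```typescript
type OutputMode = "text" | "json" | "jsonl" | "raw";

// Decides where an early validation failure goes and what gets written.
function formatEarlyError(
  mode: OutputMode,
  schema: string,
  message: string,
): { stream: "stdout" | "stderr"; text: string } {
  if (mode !== "json") {
    // Other modes keep today's plain-text error behavior.
    return { stream: "stderr", text: message + "\n" };
  }
  const envelope = { schema, schemaVersion: "1", didError: true, error: message, data: null };
  return { stream: "stdout", text: JSON.stringify(envelope, null, 2) + "\n" };
}

const earlyError = formatEarlyError(
  "json",
  "xcodebuildmcp.output.launch-result",
  "Missing required argument: simulator-id",
);
```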
Use consistent field naming across tools.
Preferred common field names:
- `appPath`
- `bundleId`
- `processId`
- `simulatorId`
- `deviceId`
- `buildLogPath`
- `runtimeLogPath`
- `osLogPath`

Collections should use plural nouns:
- `simulators`
- `devices`
- `schemes`
- `projects`
- `tests`
- `entries`
For open-ended map-like outputs, prefer deterministic entry arrays over unordered freeform objects where snapshot stability matters.
Example:
{
"entries": [
{ "key": "PRODUCT_NAME", "value": "CalculatorApp" }
]
}

For dynamic xcode-ide style tools, do not block this work on designing perfect bespoke schemas for every dynamic result.
Use a generic normalized-content schema as a fallback for proxy-style tools, based on normalized event content such as:
- header
- detail trees
- tables
- sections
- file refs
- summary
This should still exclude next steps and raw event passthrough.
The doc needs concrete examples because the envelope by itself is not the hard part. The important part is what goes in data.
These are representative target shapes, not final frozen contracts.
Schema:
- `schema: "xcodebuildmcp.output.app-path"`
- `schemaVersion: "1"`
{
"schema": "xcodebuildmcp.output.app-path",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {
"artifacts": {
"appPath": "~/Library/Developer/Xcode/DerivedData/.../Build/Products/Debug-iphonesimulator/CalculatorApp.app",
"simulatorId": "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE"
}
}
}

Notes:
- shared schema family with device/macOS app-path tools is fine
- this now follows the same `artifacts` convention as other durable outputs
- if target identity is known and useful, keep it in `artifacts`; otherwise omit it rather than inventing a new layout
Schema:
- `schema: "xcodebuildmcp.output.simulator-list"`
- `schemaVersion: "1"`
{
"schema": "xcodebuildmcp.output.simulator-list",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {
"simulators": [
{
"name": "iPhone 16",
"simulatorId": "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",
"state": "Shutdown",
"isAvailable": true,
"runtime": "iOS 18.0"
},
{
"name": "iPhone 16 Pro",
"simulatorId": "FFFFFFFF-1111-2222-3333-444444444444",
"state": "Booted",
"isAvailable": true,
"runtime": "iOS 18.0"
}
]
}
}

Notes:
- this is a good example of primary data replacing presentation-only text/grouping
- text mode can keep emojis and grouped display; structured JSON should just expose normalized data
- if the current event stream does not expose enough data directly, this is the kind of case where we may need a small tool-specific extractor against event payloads
Schema:
- `schema: "xcodebuildmcp.output.launch-result"`
- `schemaVersion: "1"`
{
"schema": "xcodebuildmcp.output.launch-result",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {
"artifacts": {
"bundleId": "com.example.CalculatorApp",
"simulatorId": "AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE",
"processId": 12345
}
}
}

Notes:
- this is a simple action-result shape
- it now uses the same `artifacts` nesting as build-style results
- snapshot helpers can still reuse these fields; they just read them from a consistent location
Schema:
- `schema: "xcodebuildmcp.output.build-settings"`
- `schemaVersion: "1"`
{
"schema": "xcodebuildmcp.output.build-settings",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {
"entries": [
{ "key": "PRODUCT_NAME", "value": "CalculatorApp" },
{ "key": "PRODUCT_BUNDLE_IDENTIFIER", "value": "com.example.CalculatorApp" },
{ "key": "SDKROOT", "value": "iphonesimulator" }
]
}
}

Notes:
- use ordered `entries` rather than a large freeform object for better snapshot stability
- if we later decide consumers strongly prefer an object map, that would be a schema decision and versioning question, not something to drift into accidentally
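
As an illustration of the ordered `entries` convention, a key/value map could be converted like this. Sorting by key is just one possible deterministic order; this plan does not specify which order to use, and `toEntries` is a hypothetical helper name.

```typescript
interface Entry {
  key: string;
  value: string;
}

// Converts a settings map into an entries array with a deterministic (key-sorted) order.
function toEntries(settings: Record<string, string>): Entry[] {
  return Object.keys(settings)
    .sort()
    .map((key) => ({ key, value: settings[key] }));
}

const settingEntries = toEntries({
  SDKROOT: "iphonesimulator",
  PRODUCT_NAME: "CalculatorApp",
});
```

Whatever order is chosen, the point is that two runs over the same settings serialize identically, so snapshot diffs only show real changes.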
Schema:
- `schema: "xcodebuildmcp.output.build-result"`
- `schemaVersion: "1"`
{
"schema": "xcodebuildmcp.output.build-result",
"schemaVersion": "1",
"didError": false,
"error": null,
"data": {
"summary": {
"status": "SUCCEEDED",
"durationMs": 5234
},
"artifacts": {
"appPath": "~/Library/Developer/Xcode/DerivedData/.../Build/Products/Debug-iphonesimulator/CalculatorApp.app",
"buildLogPath": "~/Library/Logs/XcodeBuildMCP/build.log"
},
"diagnostics": {
"warnings": [],
"errors": []
}
}
}

Notes:
- for build/test-style tools, `data` should focus on durable result data, not transient stage events
- build stages belong in `jsonl`, not structured `json`
- this keeps the split between event stream and result document clear
{
"schema": "xcodebuildmcp.output.build-result",
"schemaVersion": "1",
"didError": true,
"error": "Build failed",
"data": {
"summary": {
"status": "FAILED",
"durationMs": 8123
},
"diagnostics": {
"warnings": [],
"errors": [
{
"message": "Cannot find 'FooBar' in scope",
"location": "Sources/App/ContentView.swift:42:13"
}
]
},
"artifacts": {
"buildLogPath": "~/Library/Logs/XcodeBuildMCP/build.log"
}
}
}

Notes:
- failure does not have to force `data` to `null`
- if we have durable structured failure data, we should keep it
- `error` stays the standardized top-level quick summary; `data.diagnostics` carries the details
This is the case that currently goes through console.error(...) and needs special handling in structured mode.
{
"schema": "xcodebuildmcp.output.launch-result",
"schemaVersion": "1",
"didError": true,
"error": "Missing required argument: simulator-id",
"data": null
}

Notes:
- this is why `--output json` needs custom early-error handling
- machine consumers should still get one valid envelope even when the tool never started
Add unit coverage for:
- output mode wiring
- `json` vs `jsonl` dispatch
- structured error envelope generation
- schema registry uniqueness
- envelope building behavior
- event filtering that excludes `next-steps`
- manifest coverage so every CLI-exposed tool has a structured-output definition
That manifest coverage test is important. It prevents future drift when new tools are added.
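
The core of such a coverage test is a set difference between CLI-exposed tool ids and mapped tool ids. A sketch, with `findUnmappedTools` as a hypothetical helper and a made-up tool list:

```typescript
// Returns CLI-exposed tool ids that have no structured-output definition, sorted for stable output.
function findUnmappedTools(
  cliTools: readonly string[],
  mappedTools: ReadonlySet<string>,
): string[] {
  return cliTools.filter((tool) => !mappedTools.has(tool)).sort();
}

const unmappedTools = findUnmappedTools(
  ["list_sims", "build_sim", "screenshot"],
  new Set(["list_sims", "build_sim"]),
);
// A coverage test would fail whenever this array is non-empty.
```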
Keep the existing text snapshots intact.
Add a parallel fixture set for CLI structured JSON rather than replacing current fixtures.
Keep current fixtures:
- `src/snapshot-tests/__fixtures__/cli/.../*.txt`
- `src/snapshot-tests/__fixtures__/mcp/.../*.txt`
Add new structured fixtures:
- `src/snapshot-tests/__fixtures__/cli-json/.../*.json`
Extend the snapshot harness to accept an output mode:
- default remains `text`
- add `json`
- add `jsonl` if we want explicit event-stream snapshot coverage later
Add JSON-aware normalization that:
- parses the envelope
- recursively normalizes paths, UUID-like values, and other unstable values
- re-serializes deterministically with 2-space indentation
Do not normalize structured JSON with regex over raw text.
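
A sketch of what parse-then-normalize could look like. The regex here is applied to individual string values after parsing, not to the raw document text. The `<HOME>` and `<UUID>` placeholders and the function names are hypothetical.

```typescript
const UUID_PATTERN =
  /[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}/g;

// Recursively replaces unstable values inside parsed JSON.
function normalizeValue(value: unknown, homeDir: string): unknown {
  if (typeof value === "string") {
    return value.split(homeDir).join("<HOME>").replace(UUID_PATTERN, "<UUID>");
  }
  if (Array.isArray(value)) return value.map((item) => normalizeValue(item, homeDir));
  if (typeof value === "object" && value !== null) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = normalizeValue(source[key], homeDir);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}

// Parses the envelope, normalizes it, and re-serializes with 2-space indentation.
function normalizeSnapshot(rawJson: string, homeDir: string): string {
  return JSON.stringify(normalizeValue(JSON.parse(rawJson), homeDir), null, 2) + "\n";
}
```

This version leaves numbers such as `processId` untouched; a real harness would likely need a key-based rule for those as well.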
Update snapshot parser helpers so they first check structured JSON fields, then fall back to existing text parsing.
For example:
- `extractAppPathFromSnapshotOutput()` should first look at `data.artifacts.appPath`
- `extractProcessIdFromSnapshotOutput()` should first look at `data.artifacts.processId`
This is another reason to keep field naming consistent.
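
A structured-first helper with a text fallback might look like the sketch below. It reads `appPath` from the `artifacts` location used in the example envelopes in this plan. The function name and the `App Path:` fallback pattern are placeholders and do not reflect the existing text parser.

```typescript
// Structured-first lookup with a text fallback; the fallback regex is a placeholder.
function extractAppPath(output: string): string | undefined {
  try {
    const parsed = JSON.parse(output);
    const appPath = parsed?.data?.artifacts?.appPath;
    if (typeof appPath === "string") return appPath;
  } catch {
    // Not JSON: fall through to text parsing.
  }
  const match = /App Path:\s*(\S+)/.exec(output);
  return match ? match[1] : undefined;
}
```

With every schema family keeping `appPath` in the same place, one helper like this can serve all suites instead of one per tool.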
Update:
- `docs/CLI.md`
- `docs/dev/RENDERING_PIPELINE.md`
- `README.md`
- any examples or tests that currently describe `--output json` as the event stream mode
Record this as a breaking CLI contract change:
- `--output json` now means structured JSON
- previous event-stream behavior moves to `--output jsonl`
docs/dev/RENDERING_PIPELINE_REFACTOR.md is stale and should be removed or replaced after this plan is implemented, so it does not keep misleading future work.
The main risk is not the CLI wiring. The main risk is extracting clean structured payloads for every tool from the current event model without introducing brittle special cases.
Mitigation:
- build the schema registry first
- add coverage tests so every tool must be mapped
- prefer shared schema families
- only add tool-specific extraction when needed
- avoid changing text output unless absolutely necessary
- do not do a broad renderer refactor
- do not change daemon protocol unless a real blocker appears
- do not change MCP response contracts for this work
- do not parse rendered text into JSON
- do not include next steps in structured JSON
- do not scatter schema logic across manifests and runtime code
- Rename internal render strategy `cli-json` -> `cli-jsonl`
- Extend CLI output enum to include `jsonl`
- Update CLI help text and output selection wiring
- Preserve existing event-stream behavior under `jsonl`
- Add `src/structured-output/` module
- Define envelope type and schema definition contract
- Add event indexing helpers
- Add registry lookup by tool id
- Add envelope/error-building helpers
- Implement workflow/family schema definitions
- Prefer shared contracts where sensible
- Add fallback normalized-content schema for proxy/dynamic tools
- Add a manifest coverage test to require complete tool coverage
- Wire `--output json` to build a single envelope from the final session
- Update early validation failures to emit structured error envelopes in `json` mode
- Ensure stdout remains clean machine output in structured mode
- Preserve exit code behavior
- Extend snapshot harness/contracts for output mode selection
- Add JSON normalization support
- Add `cli-json` fixture tree
- Add structured JSON snapshot coverage across CLI-capable suites
- Keep existing text fixtures unchanged
- Update CLI docs and README examples
- Update rendering pipeline documentation
- Add changelog entry
- Remove or replace stale output/refactor docs
Before handoff, run the relevant non-doc checks for the implementation work:
- `npm run typecheck`
- `npm run test`
- `npm run test:snapshot`
- any targeted smoke coverage if output-mode CLI tests exist
For this planning-only change, no checks are required.
Treat structured JSON as a projection layer over the finished CLI session, not as a new renderer or a pipeline redesign.
That gives the cleanest maintainable system:
- one event production model
- one shared session capture model
- text output preserved
- jsonl output preserved
- structured json added with low regression risk
- versioned per-tool contracts that can evolve intentionally over time