feat: add JSON output format via --format flag #4

Merged
ga1az merged 3 commits into ga1az:main from Vader-7:feat/json-output on Apr 13, 2026

Conversation

Vader-7 (Contributor) commented Apr 13, 2026

Summary

Add a structured JSON output format via the --format json (-f json) flag, so that tools, CI pipelines, and LLM integrations can consume pathdigest output programmatically.

Motivation

The README mentions JSON output as "in progress". This PR implements it. Structured JSON output is essential for:

  • MCP servers that need to parse project structure for LLM context
  • CI pipelines that want to analyze codebase structure
  • Editor integrations (Cursor, VSCode) consuming digest data
  • Any tool that needs to programmatically process the digest

Usage

# JSON to file
pathdigest ./my-project --format json -o digest.json

# JSON to stdout
pathdigest ./my-project -f json -o -

# Default (text) - unchanged behavior
pathdigest ./my-project

JSON Schema

{
  "summary": {
    "source": "/path/to/project",
    "total_files": 42,
    "total_size": 123456,
    "total_size_human": "120.6 KB",
    "exclude_patterns": ["node_modules/", ".git/"],
    "include_patterns": [],
    "max_file_size": 10485760
  },
  "tree": [
    {
      "name": "src",
      "path": "src",
      "type": "directory",
      "size": 4096,
      "children": [...]
    }
  ],
  "files": [
    {
      "path": "src/main.go",
      "size": 1234,
      "type": "file",
      "content": "package main\n..."
    }
  ],
  "git_info": {
    "repo_url": "https://github.com/user/repo.git",
    "branch": "main"
  }
}
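For consumers written in Go, the schema above could be decoded with types along these lines. This is only a sketch: the type names (Digest, Summary, TreeNode, FileEntry, GitInfo), the parseDigest helper, and the sample document are hypothetical and are not the types defined in internal/digest/json.go. Only the JSON keys are taken from the schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical consumer-side types; only the JSON keys come from the schema.
type Summary struct {
	Source          string   `json:"source"`
	TotalFiles      int      `json:"total_files"`
	TotalSize       int64    `json:"total_size"`
	TotalSizeHuman  string   `json:"total_size_human"`
	ExcludePatterns []string `json:"exclude_patterns"`
	IncludePatterns []string `json:"include_patterns"`
	MaxFileSize     int64    `json:"max_file_size"`
}

type TreeNode struct {
	Name     string     `json:"name"`
	Path     string     `json:"path"`
	Type     string     `json:"type"`
	Size     int64      `json:"size"`
	Children []TreeNode `json:"children"`
}

type FileEntry struct {
	Path    string `json:"path"`
	Size    int64  `json:"size"`
	Type    string `json:"type"`
	Content string `json:"content"`
}

type GitInfo struct {
	RepoURL string `json:"repo_url"`
	Branch  string `json:"branch"`
}

type Digest struct {
	Summary Summary     `json:"summary"`
	Tree    []TreeNode  `json:"tree"`
	Files   []FileEntry `json:"files"`
	GitInfo *GitInfo    `json:"git_info"` // stays nil when the key is absent
}

// parseDigest decodes a JSON digest into the hypothetical Digest type.
func parseDigest(data []byte) (Digest, error) {
	var d Digest
	err := json.Unmarshal(data, &d)
	return d, err
}

// sample is an invented single-file digest that follows the schema.
const sample = `{
  "summary": {"source": ".", "total_files": 1, "total_size": 13,
              "total_size_human": "13 B", "exclude_patterns": [],
              "include_patterns": [], "max_file_size": 10485760},
  "tree": [{"name": "main.go", "path": "main.go", "type": "file", "size": 13}],
  "files": [{"path": "main.go", "size": 13, "type": "file", "content": "package main\n"}]
}`

func main() {
	d, err := parseDigest([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%d file(s); first is %s (%d bytes)\n",
		d.Summary.TotalFiles, d.Files[0].Path, d.Files[0].Size)
}
```

Using a pointer for git_info lets a consumer tell a local-path digest (no git_info key) from a Git URL digest without inspecting empty strings.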

Changes

  • internal/digest/json.go: New file with JSON types and FormatJSON method
  • cmd/root.go: Add --format / -f flag (default: text), route to JSON or text formatting

Test plan

  • All existing tests pass
  • go build ./cmd/pathdigest compiles successfully
  • pathdigest . -f json -o - produces valid JSON
  • pathdigest . -f text maintains backward-compatible text output
  • Git URL digests include git_info in JSON output

Add `--format json` flag to output structured JSON instead of plain text.
The JSON output includes:

- `summary`: source path, total files/size, patterns, max file size
- `tree`: full directory tree as nested objects with name, path, type, size
- `files`: flat array of all processed files with path, size, type, and
  content (when available)
- `git_info`: repository metadata when processing a Git URL

This enables programmatic consumption of pathdigest output by tools,
CI pipelines, and LLM integrations that need structured data.

Usage:
  pathdigest ./my-project --format json
  pathdigest ./my-project -f json -o digest.json

The default format remains "text" for backward compatibility.

Made-with: Cursor

Copilot AI left a comment

Pull request overview

Adds a machine-readable JSON output mode to pathdigest so external tools (CI, editor integrations, MCP/LLM tooling) can consume digests programmatically.

Changes:

  • Introduces JSON schema/types and (*Result).FormatJSON() to serialize summary/tree/files (+ optional git info).
  • Adds --format/-f flag (default text) and routes CLI output to either JSON or existing text formatting.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File | Description
internal/digest/json.go | Adds JSON output structs + traversal/serialization helpers to emit summary/tree/files/git_info.
cmd/root.go | Adds --format flag and switches CLI output logic between text and JSON.

Comment thread internal/digest/json.go Outdated
Comment on lines +19 to +22
TotalSizeHuman  string   `json:"total_size_human"`
ExcludePatterns []string `json:"exclude_patterns,omitempty"`
IncludePatterns []string `json:"include_patterns,omitempty"`
MaxFileSize     int64    `json:"max_file_size,omitempty"`

Copilot AI Apr 13, 2026

include_patterns / exclude_patterns are tagged with omitempty, so the fields disappear when empty rather than being encoded as empty arrays. The PR description's schema shows these as arrays (possibly empty), so consider removing omitempty to keep the JSON schema stable for consumers.
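The encoding behavior described in this comment can be reproduced with a short sketch. The struct names withOmit and withoutOmit and the encode helper are hypothetical stand-ins, not the PR's types. One detail worth noting: with encoding/json, dropping omitempty is not quite enough to get an empty array, because a nil slice encodes as null; the slice also has to be initialized.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmit mirrors the tag style flagged in the review (hypothetical type).
type withOmit struct {
	ExcludePatterns []string `json:"exclude_patterns,omitempty"`
}

// withoutOmit always emits the key (hypothetical type).
type withoutOmit struct {
	ExcludePatterns []string `json:"exclude_patterns"`
}

// encode marshals v and returns the JSON text, or an error string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Key disappears entirely when the slice is nil or empty.
	fmt.Println(encode(withOmit{})) // {}

	// Key is present, but a nil slice encodes as null.
	fmt.Println(encode(withoutOmit{})) // {"exclude_patterns":null}

	// Key is present and is an empty array once the slice is initialized.
	fmt.Println(encode(withoutOmit{ExcludePatterns: []string{}})) // {"exclude_patterns":[]}
}
```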

Comment thread internal/digest/json.go Outdated
Comment on lines +123 to +126
if node.Content != "" {
	f.Content = node.Content
}
*files = append(*files, f)

Copilot AI Apr 13, 2026

For NodeTypeFile, content is only set when node.Content != "", and the field is also tagged omitempty. This means empty files (and read failures that leave Content empty) produce file entries without a content field, making it ambiguous for JSON consumers; consider always emitting content for processed text files (or using a *string / explicit error indicator).

Comment thread internal/digest/json.go
Comment on lines +48 to +54
func (r *Result) FormatJSON(opts IngestionOptions) ([]byte, error) {
	output := JSONOutput{
		Summary: JSONSummary{
			Source:         opts.Source,
			TotalFiles:     r.TotalFiles,
			TotalSize:      r.TotalSize,
			TotalSizeHuman: formatBytes(r.TotalSize),

Copilot AI Apr 13, 2026

FormatJSON introduces a new output contract but there are no unit tests covering the JSON structure/fields. Since internal/digest already has tests, add tests that marshal/unmarshal the output and assert key fields (summary counts/sizes, tree paths, files content omission rules, git_info presence) to prevent breaking changes.
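The marshal/unmarshal style of check suggested here might look roughly like the following. It is a sketch under assumptions: miniOutput and miniSummary are simplified stand-ins for the PR's output types, and missingKeys is a hypothetical helper. In the repository the same assertions would more naturally live in a _test.go file using the testing package and the real FormatJSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniSummary and miniOutput are simplified, hypothetical stand-ins.
type miniSummary struct {
	TotalFiles int   `json:"total_files"`
	TotalSize  int64 `json:"total_size"`
}

type miniOutput struct {
	Summary miniSummary `json:"summary"`
	Tree    []string    `json:"tree"`
	Files   []string    `json:"files"`
}

// missingKeys marshals v, decodes it into a generic map, and reports which
// of the required top-level keys are absent from the encoded document.
func missingKeys(v interface{}, required ...string) ([]string, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	var top map[string]json.RawMessage
	if err := json.Unmarshal(data, &top); err != nil {
		return nil, err
	}
	var missing []string
	for _, key := range required {
		if _, ok := top[key]; !ok {
			missing = append(missing, key)
		}
	}
	return missing, nil
}

func main() {
	out := miniOutput{Summary: miniSummary{TotalFiles: 2, TotalSize: 10}}
	missing, err := missingKeys(out, "summary", "tree", "files", "git_info")
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	// The stand-in type has no git_info field, so only that key is reported.
	fmt.Println(missing) // [git_info]
}
```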

Comment thread cmd/root.go Outdated
	fmt.Fprintf(os.Stderr, "Error formatting JSON output: %v\n", errJSON)
	os.Exit(1)
}
outputContent = string(jsonBytes)

Copilot AI Apr 13, 2026

outputContent = string(jsonBytes) forces an extra copy of the entire JSON payload, and the file-write path later converts it back to []byte. For large outputs, keep JSON as []byte and write it directly to stdout/file to avoid these allocations.

Comment thread cmd/root.go Outdated
fmt.Fprintln(os.Stderr, "\n--- Summary ---")
fmt.Fprint(os.Stderr, ingestResult.Summary)
if outputFormat != "json" {
	ingestResult.FormatOutput(opts)

Copilot AI Apr 13, 2026

FormatOutput is invoked twice in the text path (once to build outputContent, and again before printing the summary). This re-traverses the tree and rebuilds large strings unnecessarily; call it once and reuse ingestResult.Summary/TreeStructure/FileContents.

Suggested change
- ingestResult.FormatOutput(opts)

Comment thread cmd/root.go Outdated
Comment on lines +98 to +100
	ingestResult.FormatOutput(opts)
	outputContent = ingestResult.TreeStructure + "\n" + ingestResult.FileContents
}

Copilot AI Apr 13, 2026

In text mode outputContent = ingestResult.TreeStructure + "\n" + ingestResult.FileContents creates an additional full copy of the digest in memory. For large repos, prefer writing TreeStructure and FileContents directly to the selected writer (stdout/file) instead of concatenating.

Comment thread cmd/root.go Outdated
} else {
	fmt.Println(ingestResult.TreeStructure)
	fmt.Println(ingestResult.FileContents)
	fmt.Println(outputContent)

Copilot AI Apr 13, 2026

fmt.Println(outputContent) always appends an extra newline. For --format json this can be undesirable when piping to tools that expect exact JSON without trailing output; use fmt.Print/os.Stdout.Write to emit the JSON bytes as-is.

Suggested change
- fmt.Println(outputContent)
+ fmt.Print(outputContent)

Comment thread cmd/root.go Outdated
Comment on lines +90 to +97
if outputFormat == "json" {
	jsonBytes, errJSON := ingestResult.FormatJSON(opts)
	if errJSON != nil {
		fmt.Fprintf(os.Stderr, "Error formatting JSON output: %v\n", errJSON)
		os.Exit(1)
	}
	outputContent = string(jsonBytes)
} else {

Copilot AI Apr 13, 2026

--format values other than "json" silently fall back to text output. Consider validating the flag (accept only "text"/"json", optionally case-insensitive) and returning a clear error for unsupported values to avoid surprising CLI behavior.
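The validation suggested in this comment might be sketched as below. The parseFormat helper and its error wording are hypothetical; how the merged code actually validates the flag is not shown on this page.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseFormat normalizes the flag value (case-insensitive, surrounding
// whitespace ignored) and rejects anything other than "text" or "json".
func parseFormat(s string) (string, error) {
	switch f := strings.ToLower(strings.TrimSpace(s)); f {
	case "text", "json":
		return f, nil
	default:
		return "", fmt.Errorf("unsupported --format %q (expected \"text\" or \"json\")", s)
	}
}

func main() {
	for _, raw := range []string{"json", "TEXT", "yaml"} {
		format, err := parseFormat(raw)
		if err != nil {
			fmt.Fprintln(os.Stderr, "Error:", err)
			continue
		}
		fmt.Println("using format:", format)
	}
}
```

Failing early with a message that names the accepted values avoids the silent fallback to text described above.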

ga1az added 2 commits April 13, 2026 15:17
- Validate --format flag (reject unsupported values like 'yaml')
- Remove duplicate FormatOutput call in text path
- Write JSON bytes directly to file/stdout (avoid string round-trip)
- Use fmt.Print for JSON stdout (no trailing newline from Println)
- Remove omitempty from exclude/include_patterns (stable JSON schema)
- Always include content field for files (eliminate empty file ambiguity)
- Extract writeOutputFile helper to reduce duplication
- Add comprehensive tests for FormatJSON (6 test cases)
ga1az merged commit c73d535 into ga1az:main on Apr 13, 2026
1 check passed

3 participants