
MaxCode: Add MergeAgent for file discovery, import-graph analysis and merging #24

Open
gvanica wants to merge 1 commit into main from split/2-merge-agent


gvanica (Collaborator) commented Apr 22, 2026

Summary

Introduces MergeAgent (agents/migration/merge_agent.py), a pure-logic agent (no LLM calls) that encapsulates the complete pre-conversion file preparation pipeline. This logic was previously embedded directly in the demo script step3_merge.py (~600 lines) and is now a reusable component that PrimaryAgent can call programmatically.

What MergeAgent does

  1. File discovery — Walks the repository tree and identifies Python model files, filtering out tests, setup scripts, and infrastructure
  2. Infrastructure filtering — Excludes non-model files (configs, tokenizers, CLI entry points) using pattern-based heuristics
  3. Import-graph analysis — Parses import / from ... import statements to build a dependency graph between files
  4. Topological sorting — Orders files so that dependencies appear before dependents in the merged output
  5. File merging — Concatenates files in dependency order, deduplicates imports, and resolves cross-file references, producing a single merged_model.py and an optional merged_utils.py
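
Steps 3 and 4 above can be illustrated with a minimal sketch. This is not the code in merge_agent.py; it assumes modules are identified by their bare file stem and that the graph only tracks imports of other discovered files. The helper names `local_imports` and `dependency_order` and the sample `sources` mapping are hypothetical.

```python
import ast
from graphlib import TopologicalSorter


def local_imports(source: str, known_modules: set[str]) -> set[str]:
    """Return the members of known_modules that this source imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Covers `from pkg.layers import X` and `from .layers import X`.
            # `from . import layers` has no module name and is skipped here.
            names = [node.module]
        else:
            continue
        for name in names:
            # Assumption: match on the last dotted component only.
            leaf = name.rsplit(".", 1)[-1]
            if leaf in known_modules:
                found.add(leaf)
    return found


def dependency_order(sources: dict[str, str]) -> list[str]:
    """Order module names so that each one follows the modules it imports."""
    known = set(sources)
    graph = {
        name: local_imports(text, known) - {name}
        for name, text in sources.items()
    }
    # TopologicalSorter expects node -> predecessors and yields predecessors
    # first. It raises graphlib.CycleError on circular imports.
    return list(TopologicalSorter(graph).static_order())


# Hypothetical three-file repository, keyed by module name.
sources = {
    "model": "from .layers import Block\nfrom .utils import init\n",
    "layers": "import torch\nfrom .utils import init\n",
    "utils": "import math\n",
}
order = dependency_order(sources)
print(order)
```

Third-party imports such as `torch` never enter the graph because they are not among the discovered files, so the ordering depends only on intra-repository edges.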

Files

| File | Description |
| --- | --- |
| agents/migration/merge_agent.py | New — MergeAgent with MergeResult dataclass (740 lines) |
| examples/demo/step3_merge.py | New — Thin CLI wrapper that calls MergeAgent.run() (127 lines) |
| examples/demo/merged_utils.py | New — Shared utility functions extracted during merge (139 lines) |

Design decisions

  • No LLM dependency — The agent uses only AST parsing and pattern matching, making it fast and deterministic
  • Separate model vs utility output — Produces both model_code and utility_code so the conversion agent can handle them separately
  • Dataclass result — MergeResult provides structured access to merged code, included file lists, and the dependency graph
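
To make the result shape concrete, a dataclass along these lines would fit the description above. Only `model_code` and `utility_code` are named in this PR description; the class name `MergeResultSketch` and the fields `model_files`, `utility_files`, and `dependency_graph` are hypothetical stand-ins, and the real MergeResult may differ.

```python
from dataclasses import dataclass, field


@dataclass
class MergeResultSketch:
    """Illustrative stand-in for MergeResult; not the actual definition."""

    # Named in the PR description.
    model_code: str
    utility_code: str = ""
    # Hypothetical field names for the "included file lists".
    model_files: list[str] = field(default_factory=list)
    utility_files: list[str] = field(default_factory=list)
    # Hypothetical: file -> set of files it depends on.
    dependency_graph: dict[str, set[str]] = field(default_factory=dict)


result = MergeResultSketch(
    model_code="class Block: ...\n",
    model_files=["layers.py", "model.py"],
    dependency_graph={"model.py": {"layers.py"}, "layers.py": set()},
)
print(result.model_files, bool(result.utility_code))
```

Using `field(default_factory=...)` gives each instance its own list and dict, so a caller that appends to one result does not affect another.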

Test plan

  • Run step3_merge.py on a cloned demo repo and verify merged output contains all model components
  • Verify topological ordering: no forward references in merged output
  • Verify infrastructure files (setup.py, configs) are excluded from merge
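
The last two checks could be automated roughly as follows. This is a sketch under assumptions: it presumes the merge order and dependency graph are available as a list and a mapping, and the function names and `INFRA_PATTERNS` are hypothetical rather than the heuristics MergeAgent actually uses.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns; the real filter in merge_agent.py may differ.
INFRA_PATTERNS = ("setup.py", "conftest.py", "test_*.py", "*_test.py", "*config*.py")


def forward_references(order, graph):
    """Return (file, dependency) pairs where the dependency is missing from
    the merge order or appears after the file that needs it."""
    position = {name: index for index, name in enumerate(order)}
    return [
        (name, dep)
        for name in order
        for dep in sorted(graph.get(name, ()))
        if position.get(dep, len(order)) > position[name]
    ]


def leaked_infrastructure(included_files, patterns=INFRA_PATTERNS):
    """Return included paths whose file name matches an infrastructure pattern."""
    return [
        path
        for path in included_files
        if any(fnmatch(PurePosixPath(path).name, pattern) for pattern in patterns)
    ]


graph = {"model": {"layers", "utils"}, "layers": {"utils"}, "utils": set()}
assert forward_references(["utils", "layers", "model"], graph) == []
assert leaked_infrastructure(["models/attention.py", "models/model.py"]) == []
```

Both helpers return the offending entries rather than a boolean, so a failing check reports exactly which file broke the ordering or slipped past the filter.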

Split from #17 — PR 2 of 8

Introduces MergeAgent, a pure-logic agent (no LLM calls) that
encapsulates the file discovery, infrastructure filtering,
import-graph analysis, topological sorting, and file merging logic
previously embedded in the demo script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
google-cla Bot commented Apr 22, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
