feat: multi-language support for the encoder and decoder pipelines by HuYaSen · Pull Request #67 · microsoft/RPG-ZeroRepo

HuYaSen · 2026-06-15T04:31:12Z

Summary

This PR extends CoderMind's encoder (repository → RPG graph) and decoder (docs → repository) pipelines from Python-only to multi-language, adding support for Python, JavaScript, TypeScript, Go, C, C++, and Rust.

All language differences are isolated behind two abstraction layers:

lang_parser — a tree-sitter–based parser module giving the encoder uniform code-structure and dependency extraction.
decoder_lang backends — one backend per language encapsulating file layout, entry points, test commands, code structure, and build environment; every decoder stage (plan / codegen / verification) goes through these backends.

Encoder

New standalone lang_parser module (tree-sitter backend + 7 language parsers/configs) wired into the RPG / dep-graph / encoder pipeline.
Language metadata flows through refactor_tree NodeMetaData; rpg.json's meta.language reflects the real repo instead of always python.
Dominant-language detection; add_file and semantic re-runs dispatch per language; non-Python class-like units normalized.
rpg.json single source of truth (dep_graph embedded); atomic writes eliminate half-written files.
C/C++ calls resolve to the definition, not the header prototype; scan exclusions centralized and per-commit LLM exclusion dropped (perf).

Decoder

Seven language backends with a unified protocol: entry points, test commands, inheritance, code structure, build env, unit-kind.
Planning: language metadata flows through feature spec → build → tree → skeleton → interfaces → tasks; interface design/review works for every language with language-neutral prompts; non-Python files ordered by real imports, not Python AST.
Code generation: codegen / post-verify / global-review / subtree-review / final-test all route through the backend and use native test commands (go test, npm, cargo, ctest, …); non-Python projects no longer emit Python files or run venv/pip setup.
Verification: on-disk source fallback for language resolution; zero-test "no-op pass" rejected; bounded repair loop in final-test; unified git branch-name sanitization.
Entry coordination: <MAIN_ENTRY> and the skeleton entry are reconciled across all 7 languages to avoid a duplicate main.
Interface review (final fix): unit-level and feature-level orphan checks unified into a single shared predicate, removing spurious WARNs on framework roots/factories/callbacks; entry-point name matching normalized (RunMain ↔ function RunMain).
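
The zero-test rejection and the bounded repair loop from the Verification bullet can be sketched as follows. This is a minimal illustration, not the PR's code: `RunOutcome`, `is_real_pass`, `final_test`, and the `max_repairs` default are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunOutcome:
    """Hypothetical summary of one native test run (go test, cargo test, ...)."""
    exit_code: int
    tests_run: int
    failures: int


def is_real_pass(outcome: RunOutcome) -> bool:
    """A run passes only if it exited cleanly AND executed at least one test."""
    return outcome.exit_code == 0 and outcome.failures == 0 and outcome.tests_run > 0


def final_test(
    run_tests: Callable[[], RunOutcome],
    repair: Callable[[RunOutcome], None],
    max_repairs: int = 3,  # assumed bound, not taken from the PR
) -> bool:
    """Run tests, attempting at most max_repairs repairs before giving up."""
    outcome = run_tests()
    for _ in range(max_repairs):
        if is_real_pass(outcome):
            return True
        repair(outcome)
        outcome = run_tests()
    return is_real_pass(outcome)
```

A run that exits with status 0 but executes no tests is treated the same as a failing run, so an empty test suite cannot be reported as success.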

Adds a self-contained source-parsing module supporting Python, Go, TypeScript, JavaScript, C, C++, and Rust. The module is structurally isolated: no other code imports it yet, and this commit only brings in the module and its unit tests. A follow-up commit will wire it into scripts/rpg/* and scripts/rpg_encoder/* so the encoder actually parses non-Python sources.

Layout
- scripts/lang_parser/
- base.py: BaseLanguageParser ABC
- models.py: LanguageConfig / LPCodeUnit / LPDependency / LPFileResult
- registry.py: detect_language / parse_file / validate_syntax / is_supported_source / is_test_file / get_parser{,_for_file} / get_config{,_for_path} / markdown_fence_for_path
- tree_sitter_backend.py: lazy wrapper around optional tree-sitter grammar packages (a missing grammar only disables one language, never crashes import)
- config/: one per language (extensions / test globs / tree-sitter language name / node-type vocab)
- <lang>_parser.py: one per language; _c_family_parser.py and _ecmascript_parser.py hold shared logic
- extractors/: fallback unit extraction
- tests/test_lang_parser_*.py: 66 tests covering registry + each parser API contract
- Public surface is locked via lang_parser/__init__.py's __all__. Callers must import from the top level (e.g. 'from lang_parser import parse_file'); reaching into lang_parser.python_parser etc. is reserved for tests.

Dependencies (added to pyproject.toml)
- tree-sitter-go / tree-sitter-typescript / tree-sitter-javascript / tree-sitter-c / tree-sitter-cpp / tree-sitter-rust. All grammars are loaded lazily through tree_sitter_backend; a user who only wants Python support can uninstall any subset and the rest of the module keeps working.

Verification
- 'pytest tests/test_lang_parser_*.py' -> 66 passed.
- Module import smoke-test confirms all 15 public symbols load.
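
The lazy-loading behaviour described for tree_sitter_backend (a missing grammar disables one language without breaking import) could look roughly like this. It is a sketch under assumptions: `GRAMMAR_MODULES`, `load_grammar`, and `is_language_available` are illustrative names, not the module's real API.

```python
import importlib
from functools import lru_cache
from types import ModuleType
from typing import Optional

# Illustrative mapping from language name to the importable grammar module.
GRAMMAR_MODULES = {
    "go": "tree_sitter_go",
    "rust": "tree_sitter_rust",
    "typescript": "tree_sitter_typescript",
}


@lru_cache(maxsize=None)
def load_grammar(language: str) -> Optional[ModuleType]:
    """Import the grammar module on first use; return None if unavailable."""
    module_name = GRAMMAR_MODULES.get(language)
    if module_name is None:
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # A missing grammar disables this one language only.
        return None


def is_language_available(language: str) -> bool:
    """True when the grammar for this language can be imported."""
    return load_grammar(language) is not None
```

Because nothing is imported at module load time, uninstalling a grammar package only changes the answer of `is_language_available` for that language.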

Builds on the standalone lang_parser module landed in the previous commit. The Python path is preserved unchanged everywhere; non-Python sources (Go, TypeScript / JavaScript, C / C++, Rust) now flow through parallel lang_parser branches and produce the same RPG node + edge shape downstream consumers already expect.

rpg/models.py
- NodeMetaData gains a 'language' field (Optional[str]) propagated by to_dict / from_dict. Legacy artefacts that omit the field load as language=None.

rpg/dep_graph.py
- import lang_parser at module level.
- _exclude_irrelevant_for_parse now accepts any language registered with lang_parser and uses lang_parser.is_test_file for per-language test detection (replaces hardcoded '.py' + test_/_test.py heuristics).
- DependencyGraph.__init__ adds a lazy Go module-path cache.
- New class constants: _LP_CLASS_LIKE_UNIT_TYPES, _LP_NODE_TYPES, _TS_JS_IMPORT_EXTENSIONS, _C_FAMILY_EXTENSIONS, _RUST_EXTENSIONS.
- parse() dispatches on lang_parser.detect_language: Python keeps the original ast path (unchanged), non-Python routes through _parse_lp_file_result + _parse_lp_invoke_dependencies.
- ~27 new methods adapt LPFileResult into RPG nodes/edges and resolve cross-language dependencies: _parse_lp_file_result, _parse_lp_invoke_dependencies, _parse_lp_dependencies, _lp_* (node id / attrs / type helpers), _add_lp_* (edge helpers), _resolve_lp_* (cross-file import + invoke resolution), _resolve_{ecmascript,go,c,rust}_invoke, _is_c_family_import, _is_rust_import, _resolve_rust_* (mod/super/crate/path helpers), _resolve_go_module_import + _read_go_module_path, plus the import placeholder helpers (_ensure_lp_import_placeholder, ...).
- Python file nodes are now also tagged with language='python' so downstream tooling can branch uniformly on the language attribute.

rpg/code_unit.py
- ParsedFile.__init__ tries lang_parser first (_parse_with_language_parser) and falls back to the original ast path. Python stays on the ast path for behavioural parity.
- _code_units_from_parser_result adapts LPFileResult units into CodeUnit, preserving language / line range in extra.
- CodeUnit.lineno / end_lineno fall back to extra['line_start' / 'line_end'] when self.node is not an ast.AST (i.e. lang_parser units).
- CodeSnippetBuilder.generate_code_snippet skips the ast.parse import / assignment scan when the language is not Python (a new _language_for_units helper detects via unit.extra['language'] or file_path).
- CodeSnippetBuilder.build now emits the correct markdown fence per file via lang_parser.markdown_fence_for_path (ts -> typescript, etc).

rpg_encoder/rpg_evolution.py
- _filter_non_test_py_files keeps its historical name but accepts any lang_parser-supported source via is_supported_source + is_test_file. Diff-driven incremental update paths (added / deleted / modified file filters) use is_supported_source too.

rpg_encoder/rpg_encoding.py
- skeleton's valid_files filter uses is_supported_source AND not is_test_file.

rpg_encoder/semantic_parsing.py
- file enumeration uses is_supported_source AND not is_test_file.

tests
- tests/test_multilingual_code_unit.py: 4/4 pass
- tests/test_multilingual_dep_graph.py: 5/6 pass
- tests/test_multilingual_encoder_pipeline.py: 2/4 pass
- tests/test_multilingual_prompt_safety.py: 0/3 pass

Known failures (intentionally deferred to a follow-up PR):
- test_multilingual_prompt_safety: requires editing scripts/rpg_encoder prompt strings to remove Python-only wording ('Python classes', '.py files', etc). Pure prose work, orthogonal to this change.
- test_multilingual_encoder_pipeline::test_refactor_tree_assigns_*: requires propagating language metadata through scripts/rpg_encoder/refactor_tree.py. That file has independently evolved on main and needs a separate audit to avoid clobbering its current behaviour.
- test_multilingual_encoder_pipeline::test_go_repo_enters_semantic_*: exercises a deeper semantic_parsing code path (group_units + per-language prompts) that this PR does not touch.
- test_multilingual_dep_graph::test_incremental_update_keeps_typescript_*: exercises the incremental-update edge-replay path which is orthogonal to the first-pass parse done here.

Verification (rpgkit conda env on this checkout)
- pytest tests/test_lang_parser_*.py -> 66 / 66 pass
- pytest tests/test_rpg_models.py tests/test_rpg_encoding.py tests/test_e2e.py tests/test_integration.py -> no new failures (1 pre-existing failure on main; verified by git-stash bisection)

Self-review of the previous wire-in commit caught three places where the multi-language path was incomplete or inconsistent. None of these affect the Python-only happy path, but each silently degrades the multi-language story.

dep_graph.py: reparse_ast() must mirror parse() dispatch
Previously, after a from_dict round-trip, reparse_ast unconditionally ran ast.parse on every file node. That raises SyntaxError on Go / TS / C / Rust / ... sources, which were then silently skipped, leaving non-Python files with no 'ast' attribute and therefore no rebuilt import / invoke / inherit edges. Now it dispatches on lang_parser.detect_language exactly like parse() does and routes non-Python through _parse_lp_file_result + _parse_lp_invoke_dependencies.

rpg_evolution.py: incremental-diff filters now also exclude tests
The added / deleted / modified filters were using is_supported_source but had lost the 'and not is_supported_test_file' check during the wire-in. Restored parity with the old repo so the encoder doesn't treat newly added test files (e.g. main_test.go) as production code on incremental update.

models.py: infer_type_name_from_path now recognises non-Python sources
The fallback inference returned 'directory' for any path that didn't end in '.py', which would mislabel .go / .ts / .rs files as directories when metadata-driven NodeType inference is exercised. It now delegates to lang_parser.is_supported_source with a graceful ImportError fallback, so the legacy Python-only behaviour is preserved when lang_parser is not installed.

Verification
- syntax check on all three files: OK
- pytest tests/test_lang_parser_*.py + tests/test_multilingual_*.py + tests/test_rpg_models.py + tests/test_e2e.py + tests/test_integration.py -> identical pass/fail to before this commit (no regression). The 6 deferred multilingual failures and 1 pre-existing main failure remain, unchanged.

…n ParsedFile

semantic_parsing.py groups units by unit_type == 'class' / 'function' to drive its parse_classes / parse_functions LLM batches. Until now, lang_parser-produced kinds like Go 'struct', Rust 'enum'/'trait', TS/Go 'interface', C/C++ 'struct' were forwarded verbatim, so they fell into neither bucket and were silently dropped from the LLM input.

Normalise these to 'class' inside _code_units_from_parser_result, keep the original kind in extra['lp_kind'] so RPG-side renderers can still recover it, and update the parity test to lock in the new contract.

Verified end-to-end via tests/test_multilingual_encoder_pipeline.py::test_go_repo_enters_semantic_parsing_with_non_empty_units (previously KeyError, now passes).
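
The kind normalisation above amounts to a small mapping step. The sketch below assumes a simplified `Unit` dataclass and a `CLASS_LIKE_KINDS` set built from the kinds named in the message; only the `extra['lp_kind']` key comes from the commit text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Kinds named in the commit message, plus 'class' itself (illustrative set).
CLASS_LIKE_KINDS = frozenset({"class", "struct", "enum", "trait", "interface"})


@dataclass
class Unit:
    """Simplified stand-in for a code unit handed to semantic parsing."""
    name: str
    unit_type: str
    extra: Dict[str, Any] = field(default_factory=dict)


def normalise_unit(name: str, kind: str) -> Unit:
    """Map class-like kinds to 'class' and remember the original kind."""
    unit_type = "class" if kind in CLASS_LIKE_KINDS else kind
    return Unit(name=name, unit_type=unit_type, extra={"lp_kind": kind})
```

Grouping by `unit_type` then places a Go struct in the class bucket, while `extra['lp_kind']` still reports `'struct'` for renderers that need the original kind.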

…MetaData

RefactorTree now accepts a repo-wide 'language' default plus an optional 'language_map' (path-prefix -> language) and stamps each NodeMetaData it produces (FILE / CLASS / FUNCTION / METHOD) with the resolved language so downstream tooling can branch on node.meta.language instead of guessing from extension.

- __init__: add 'language' (default 'python') and 'language_map' kwargs; normalise prefixes via _normalise_lang_prefix.
- _resolve_language(path): longest-prefix match against language_map, falls back to self.language; returns the default for None/empty.
- 8 NodeMetaData sites in run() (instance path) and refactor_new_files() (classmethod path) now pass language=self._resolve_language(file_path) / instance._resolve_language(file_path).

Verified by tests/test_multilingual_encoder_pipeline.py::test_refactor_tree_assigns_language_metadata_to_go_and_typescript_nodes, which now passes; previously asserted None for every type.
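
A longest-prefix lookup of the kind `_resolve_language` is described as doing might be written as below. The free functions `normalise_prefix` and `resolve_language` are hypothetical stand-ins for the methods, and the exact normalisation rules are an assumption.

```python
from typing import Dict, Optional


def normalise_prefix(prefix: str) -> str:
    """Use forward slashes and drop leading './' and surrounding '/'."""
    cleaned = prefix.replace("\\", "/")
    while cleaned.startswith("./"):
        cleaned = cleaned[2:]
    return cleaned.strip("/")


def resolve_language(
    path: Optional[str],
    language_map: Dict[str, str],
    default: Optional[str],
) -> Optional[str]:
    """Return the language of the longest matching path prefix, else default."""
    if not path:
        return default
    target = normalise_prefix(path)
    best_language, best_length = default, -1
    for prefix, language in language_map.items():
        candidate = normalise_prefix(prefix)
        # Match whole path segments only: 'services' must not match 'servicesx/'.
        matches = (
            candidate == ""
            or target == candidate
            or target.startswith(candidate + "/")
        )
        if matches and len(candidate) > best_length:
            best_language, best_length = language, len(candidate)
    return best_language
```

With `{"services": "go", "services/web": "typescript"}`, a file under `services/web/` resolves to TypeScript because the longer prefix wins, while everything else under `services/` resolves to Go.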

…mpts

The parse/encoding prompts predate the multi-language wire-in and talked exclusively about Python: 'Python classes', 'Python repository', '__init__/__new__/__repr__' as canonical examples, '.py only' scope clauses, and Python-stack typing examples ('pandas.DataFrame', 'pyarrow.Table'). For Go / TS / C / Rust repos these phrasings bias the LLM toward Python idioms and at worst tell it to ignore non-Python files entirely.

- parse_prompts.PARSE_CLASS: 'classes' -> 'class-like constructs (classes, structs, interfaces, traits, enums, ...)'; drop __init__/__new__/__repr__ examples; rewrite DataLoader example to use a language-neutral 'new_loader' method name.
- parse_prompts.PARSE_FUNCTION: 'standalone Python functions' -> 'standalone (module-level) functions across any supported language'.
- encoding_prompts.EXCLUDE_FILES: drop the '.py only' scope block; switch to 'source files in supported languages'.
- encoding_prompts.ANALYZE_DATA_FLOW: 'Python repository' -> 'source repository'; replace pandas/pyarrow type examples with generic 'UserRecord' / 'User'.
- tests/test_multilingual_prompt_safety.py: update the schema assertions to match the richer {feature: description} payload the prompts actually emit (the legacy [feature1, feature2] array shape was already gone before this scrub).

Two related holes in the incremental dep_graph update path that silently dropped all non-Python semantic edges:

1. add_file() always ran ast.parse on the file content. For Go / TS / C / Rust this raised SyntaxError, was logged at DEBUG, and the file node was kept with NO units, NO ast attr, and NO language attr. update_files() therefore had nothing to feed into the downstream semantic passes for any modified non-Python file.
2. _rerun_semantic_passes() only walked nodes with an ast attr. Even after fix (1) populates a language attr, the import/inherit/invoke passes were ast.Module-specific and produced zero edges for the newly-parsed lang_parser files.

Fixes:
- add_file(): detect_language(nid) first; Python keeps the original ast path, anything else routes through lang_parser.parse_file + _parse_lp_file_result (matching :meth:). Invoke resolution is deliberately deferred to _rerun_semantic_passes so cross-file calls see the final unit registry.
- _rerun_semantic_passes(): after the ast-based passes, walk every FILE node whose language is non-None and non-Python, re-run lang_parser.parse_file using the cached 'code' attr, replay _parse_lp_file_result, then run _parse_lp_invoke_dependencies in a second pass so cross-file invoke targets resolve.

Verified by tests/test_multilingual_dep_graph.py::test_incremental_update_keeps_typescript_import_edges, which now passes; previously the post-update graph had no app.ts:run node and no imports edge.

…ects the repo, not always 'python'

Found via codermind-bench: every multi-language repo (Go/Rust/TS/...) had rpg.json feature nodes stamped with meta.language='python' even though dep_graph.json correctly reported the real language. Root cause was RefactorTree's language default ('python') silently winning because RPGParser never passed one.

Fix:
- rpg_encoding.RPGParser now derives the dominant language from self.valid_files via a new module-level helper _dominant_language() (uses lang_parser.registry.detect_language; ignores files with unknown/unsupported extensions; returns None only if every file is unknown). The result is passed to RefactorTree as language=. The log line surfaces the choice.
- refactor_tree.RefactorTree: language kwarg now Optional[str]=None (was 'python'), self.language typed Optional[str], _resolve_language() returns Optional[str]. Callers that never set a language now get meta.language=None instead of a fabricated value.

Verified by re-running cobra (Go): rpg.json now shows {'go': 207} where it previously showed {'python': 205}. All 96 lang_parser / multilingual tests still pass; full suite shows identical 30 pre-existing flakes as origin/main, +96 new passes, 0 new failures.

…factor classmethods

Audit follow-up to 5f9ddf6. RefactorTree.refactor_new_files() and refactor_modified_files() (used by the post-merge / incremental update path in rpg_evolution) also instantiated 'cls(...)' without language=, so files added or touched after the initial encode would land with meta.language=None even though the same repo's initial-encode rpg.json had it correct.

- lang_parser.registry: extract _dominant_language() into a public dominant_language(paths) helper (accepts any iterable; deterministic tie-break). Re-export via lang_parser package.
- rpg_encoding: switch RPGParser.parse_rpg_from_repo() to the shared helper (drop the local _dominant_language).
- refactor_tree: refactor_new_files() and refactor_modified_files() now compute the dominant language from parsed_tree.keys() and pass it to cls(language=...). Log line surfaces the choice when known.

Net change vs upstream: 96 lang_parser + multilingual tests pass; full suite shows identical 30 pre-existing flakes as origin/main with +96 net passes.
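
A dominant-language helper with the stated properties (any iterable, unknown files ignored, None when nothing is recognised, deterministic tie-break) can be sketched like this. The extension table and the alphabetical tie-break are assumptions; the commit does not say which tie-break rule it uses.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, Optional

# Illustrative extension table; the real registry is per-language config.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".go": "go",
    ".ts": "typescript",
    ".js": "javascript",
    ".rs": "rust",
    ".c": "c",
    ".cpp": "cpp",
}


def detect_language(path: str) -> Optional[str]:
    """Return the language for a path, or None for unknown extensions."""
    return EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix.lower())


def dominant_language(paths: Iterable[str]) -> Optional[str]:
    """Most frequent known language; None if every file is unknown."""
    counts = Counter(
        language
        for language in (detect_language(path) for path in paths)
        if language is not None
    )
    if not counts:
        return None
    # Highest count first; ties broken alphabetically (assumed rule).
    return min(counts, key=lambda language: (-counts[language], language))
```

Sorting on `(-count, name)` rather than relying on `Counter.most_common` makes the result independent of the order in which files are enumerated.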

Introduce a LanguageBackend strategy interface that lets the decoder pipeline (skeleton / func_design / code_gen) treat the target programming language as a parameter rather than the historical hard-coded .py / stdlib ast / pytest assumptions.

Behavioural invariants:
* All existing Python pipelines are byte-equivalent: the full decoder test suite holds at 1055 passed / 30 pre-existing flakes for every intermediate phase as well as the final state.
* Encoder bench on cobra (Go) passes end-to-end; the new package imports cleanly in a subprocess env without polluting encoder behaviour.

Phase 0: abstraction layer (scripts/decoder_lang/)
- backend.py : LanguageBackend Protocol + registry + ToolchainUnavailable
- prompt_hints.py : PromptHints dataclass
- test_result.py : TestRunResult / TestFailure / EnvHandle dataclasses
- python_backend.py : Behaviour-preserving Python implementation
- Trial wiring: code_gen/static_checks.py replaces a single ``suffix == '.py'`` literal with ``backend.is_source_file(path)``, proving the abstraction works without changing behaviour.

Phase 1: target_language propagation
- feature/schemas/spec.py: optional ``target_language`` field on FeatureSpecOutput (defaults to None; legacy artefacts load unchanged).
- decoder_lang/backend.py: ``resolve_decoder_language`` with a 4-tier fallback chain (feature_spec -> RPG root meta -> dominant_language -> python with WARNING).
- skeleton/file_designer.py: FileDesigner.__init__ accepts ``target_language``, stores resolved backend on ``self.backend``.

Phase 2: skeleton multi-language + GoBackend skeleton subset
- decoder_lang/go_backend.py: skeleton-relevant subset (file extension, identifier rules, package marker = None, prompt hints); AST/test methods raise NotImplementedError until Phase 3/4.
- skeleton/skeleton_models.py: ``add_init_files`` accepts optional ``backend`` parameter; backends whose ``package_marker_filename`` returns None turn it into a no-op (Go/Rust/TS).
- skeleton/file_designer.py: three ``misc.py`` literals replaced with ``misc{self.backend.file_extension}``; ``validate_directory_structure`` accepts a backend so Go path segments are checked against Go naming rules (reject hyphens, keywords like ``func``).

Phase 3: func_design migrates from stdlib ast to lang_parser via backend
Gap analysis showed lang_parser's python_parser already stuffs the raw ast node into ``LPCodeUnit.extra['ast_node']``, so no schema change is needed.
- decoder_lang/backend.py: adds ``list_code_units``, ``format_signature``, ``list_imports`` to the Protocol.
- decoder_lang/python_backend.py: implements all three plus ``find_main_block_lineno`` (Python-only hook, feature-detected via getattr).
- decoder_lang/go_backend.py: stubs the three Protocol methods.
- func_design/interface_agent.py: ~22 ast call-sites migrated to ``backend.*``. Python-specific helpers (``_extract_name_from_node``, ``_extract_type_names``, ``ast.get_docstring`` for docstring inspection) stay because they are intrinsically Python-AST-shaped.
- func_design/interface_review.py: ``_insert_unit_into_file_code`` routes through ``backend.find_main_block_lineno``; ``import ast`` removed.
- func_design/base_class_agent.py: unused ``import ast`` removed.

Phase 4a: code_gen helpers route through backend (scoped subset)
- code_gen/prompts.py: class+method walker uses backend.list_code_units; ``import ast as _ast_mod`` removed.
- code_gen/batch_prompts.py: signature summary uses backend.list_code_units + raw ast node (preserves param/return rendering byte-for-byte).
- code_gen/test_runner.py: import scanner uses backend.list_imports; ``import ast`` removed.
- Deliberately NOT migrated (Python-AST intrinsic, would force-fit wrong abstraction): static_checks.py body-shape detection, context_collector.py SQLAlchemy ORM heuristics, rpg_updater.py call-site walker.

Phase 5: language directive preamble for LLM prompts
- decoder_lang/prompt_directive.py: ``language_directive(backend)`` returns an empty string for Python (zero-impact) and a 4-line preamble (display name, style directive, fence reminder, framework hint) for other languages.
- skeleton/file_designer.py: all three LLM call sites wrap the system prompt with ``with_language_directive(prompt, self.backend)``.
- Bulk substitution of literal "Python" / ".py" / "pytest" across the prompt body files is deferred until a real Go decoder run surfaces concrete failures to fix.

Test coverage
- 101 new tests in decoder_lang/tests/ (Phases 0/1/2/3/5).
- Full decoder regression suite: 1055 passed / 30 pre-existing flakes (identical to dev/lang-parse-module-on-main baseline) at every intermediate and final state.

Planning notes (gitignored, under CoderMind/plans/)
- decoder_multilang.md : architecture + 6-phase plan
- decoder_multilang_phase3_gap_analysis.md
- decoder_multilang_test_runbook.md
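
The 4-tier fallback chain behind ``resolve_decoder_language`` (feature_spec -> RPG root meta -> dominant_language -> python with WARNING) could be sketched as follows. The signature, which takes three already-extracted optional strings, and the logger name are assumptions made for the example; the real function may read these values from richer objects.

```python
import logging
from typing import Optional

logger = logging.getLogger("decoder_lang.sketch")  # hypothetical logger name


def resolve_decoder_language(
    feature_spec_language: Optional[str],
    rpg_root_language: Optional[str],
    dominant: Optional[str],
) -> str:
    """Return the first language that is set, in priority order."""
    candidates = (
        ("feature_spec", feature_spec_language),
        ("rpg_root_meta", rpg_root_language),
        ("dominant_language", dominant),
    )
    for source, language in candidates:
        if language:
            logger.debug("decoder language %r taken from %s", language, source)
            return language
    # Last tier: keep the historical behaviour, but make it visible.
    logger.warning("no target language found; falling back to 'python'")
    return "python"
```

Logging a WARNING on the last tier keeps legacy Python projects working while making a silently missing language visible in multi-language runs.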

…f-files)

Route the five core json write sites in the encoder + post-commit hook through ``common.rpg_io.atomic_write_rpg`` so a process kill in the middle of serialisation can never leave a truncated file behind. Bench mid-run kills (e.g. ``./bench encode-repos`` while a worker is mid-encode) used to corrupt the cache and required manual cleanup; with this change the original file always survives intact and the ``.tmp`` is removed automatically.

Failure modes the new path handles vs the old ``open("w") + json.dump`` pattern:
* SIGKILL mid-write : old file intact, .tmp orphaned
* Serialiser TypeError : old file intact, .tmp cleaned up
* Disk full mid-flush : old file intact, .tmp cleaned up
* fsync failure : exception raised, no rename
* Successful path : .tmp renamed atomically over target

To support the encoder rounds that previously needed ``default=`` for non-serialisable objects (NodeMetaData etc.), ``atomic_write_rpg`` now forwards ``**dump_kwargs`` to ``json.dump`` so callers migrate without losing custom serialiser hooks.

Sites migrated:
CoderMind/scripts/rpg/models.py
- save_json : every RPGService.save call site
- save_dep_graph : post-commit hook + initial encode
CoderMind/scripts/rpg_encoder/run_encode.py
- initial rpg.json write
- initial dep_graph.json write
CoderMind/scripts/rpg_encoder/rpg_encoding.py
- two intermediate rpg.json writes (use the new dump_kwargs path for ``default=`` round-tripping of encoder-internal objects)
CoderMind/scripts/update_graphs.py
- post-commit hook dep_graph.json refresh

Tests: CoderMind/tests/test_rpg_io.py (+2)
- test_forwards_dump_kwargs : kwarg propagation
- test_no_partial_file_on_serialise_failure : verifies the cleanup contract the bench kill case kept hitting.

Regression: 1057 passed / 30 pre-existing flakes (1055 baseline + 2 new atomic-write tests).
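
The write-to-temp, fsync, then rename pattern behind these guarantees can be sketched with the standard library alone. `atomic_write_json` is a hypothetical stand-in for ``atomic_write_rpg``, whose actual signature is not shown in this PR text.

```python
import json
import os
import tempfile
from typing import Any


def atomic_write_json(path: str, payload: Any, **dump_kwargs: Any) -> None:
    """Serialise payload to path so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, **dump_kwargs)  # e.g. default=str
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic swap over the old file
    except BaseException:
        # Serialiser errors, disk-full, fsync failures: drop the temp
        # file and leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A SIGKILL cannot be caught, so in that one case the temporary file stays behind, but the target is still either the old complete file or the new complete file.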

Follow-up to ac853f7. Audit found four additional ``json.dump`` sites that serialise RPG-derived payloads but were missed in the first sweep: same half-file risk, same one-line fix using ``common.rpg_io.atomic_write_rpg``.

Sites migrated:

CoderMind/scripts/rpg_encoder/run_update_rpg.py
``cmind update-rpg`` main rpg.json write. The previous pass covered run_encode.py / rpg_encoding.py but the incremental update path used its own bare ``open("w") + json.dump`` and so the half-file failure mode survived for users running ``cmind update-rpg`` after the encoder's initial run.

CoderMind/scripts/rpg_encoder/version_control.py
``RPGVersionControl.save_version``. The version history file embeds ``rpg.to_dict()``; a killed save left a half-written rpg.vN.json that ``rollback(version=N)`` then failed to parse. ``RPGVersionControl.rollback`` was already using ``atomic_write_rpg`` for the symmetric main-file write since ac853f7, so this commit aligns the save side with the load side.

CoderMind/scripts/rpg_encoder/rpg_evolution.py
``RPGEvolution.process_diff`` save_path block. Embeds ``rpg.to_dict()`` as ``rpg.structure`` plus a feature_tree and diff summary; mid-write kill left a truncated diff artefact that ``cmind diff`` / debug tools failed to read on next load.

CoderMind/scripts/code_gen/stage_io.py
``save_stage_result`` for ``codegen_<name>.json`` sidecars (final_test / smoke_test / global_review). global_review loads all three as context; a killed earlier stage used to leave a truncated sidecar that surfaced as a JSONDecodeError later in the pipeline (looked like a global_review bug). Uses the ``default=str`` forwarding extension to ``atomic_write_rpg`` added in ac853f7 to preserve the original Path / datetime fallback serialiser without behaviour change.

Tests:
No new tests added. The contract (no partial file on kill or serialise error) is already covered by ``test_no_partial_file_on_serialise_failure`` in CoderMind/tests/test_rpg_io.py from ac853f7. Existing test suites for the migrated sites (``test_workflow_integration::TestRPGVersionControl``, ``test_rpg_evolution``) continue to pass.

Regression: same 29 pre-existing flakes as before this commit (test_initial_encode_prompt: log location mismatch with ~/.cmind workspace dir; test_step3_polish: hook install API drift; test_rpg_evolution::test_update_dep_graph_index_no_crash: unrelated to atomic write). 991 passed unchanged.

Bench smoke batch (5 repos, parallel 3) re-runs with all-PASS verdicts: chalk 289s / cobra 488s / requests 550s / sds 621s.

The dep_graph has been duplicated for years: ``run_encode.py`` wrote both ``rpg.json`` (which embeds it via ``RPG.to_dict(include_dep_graph=True)``) AND a standalone ``dep_graph.json``. The post-commit hook then refreshed only the standalone file, so ``RPGService.load`` had to override the embedded copy with the external one on every read. Two consequences:

* ~650 KB of duplicated bytes per encode on disk.
* Subtle drift: ``RPG.load_json`` (encoder side) returned the embedded copy while ``RPGService.load`` (decoder side) silently overrode it with whatever the hook had last persisted. Bench runs eventually surfaced this through "rpg.json says X, dep_graph.json says Y" diagnostics.

This commit makes ``rpg.json`` the single source of truth:

* Encoder no longer writes a standalone ``dep_graph.json``; the in-memory ``rpg.dep_graph`` is embedded by ``RPG.to_dict`` and persisted via the existing atomic ``rpg.json`` write.
* All downstream writers (``update_graphs.py`` modes, ``run_batch`` codegen refresh, ``rpg_encoder.run_update_rpg``, ``RPGEvolution`` process_diff path) stop passing ``save_path`` for the dep_graph; they rely on ``svc.save(rpg_path)`` to roll the dep_graph into rpg.json.
* ``RPGService.load`` flips its read order: embedded first, legacy external ``_dep_graph_file`` only as a backward-compat fallback (logs an INFO when used so silent migrations remain visible).
* All known reader sites either already preferred embedded (``rpg_visualize.py::load_rpg``, ``rpg_edit/validate.py``) or now do (``update_graphs.py::update_feature`` mode).
* ``RPG.save_dep_graph`` / ``RPG.load_dep_graph`` keep their public signature: callers that still want a standalone snapshot for debugging (and the legacy CLI ``--dep-graph`` flags) keep working.
* ``--dep-graph`` CLI flags stay so legacy workspaces with an existing ``dep_graph.json`` continue to load via the compat path.

Sites touched (10 files):

CoderMind/scripts/rpg_encoder/run_encode.py
Delete the 40-LOC standalone dep_graph.json write block. dep_graph parses + embeds via rpg.to_dict() as before; no second-file write.

CoderMind/scripts/rpg/service.py
- ``RPGService.load``: prefer embedded dep_graph over external ``_dep_graph_file``; INFO log on the legacy fall-through.
- ``sync_from_commit_diff``, ``sync_from_file_list``, ``_apply_incremental_dep_graph_update``: ``save_path`` becomes ``Optional[...]`` (default ``None``) so callers can rely on the embedded copy.

CoderMind/scripts/update_graphs.py
All 6 modes (dep / mapping / feature / full / enrich / sync) drop the standalone write. ``dep`` mode prefers rpg.json (only falls back to writing dep_graph.json when no rpg.json exists yet, the very-first pre-commit hook case). All writers clear ``_dep_graph_file`` so RPGService.load doesn't fall through on the next read.

CoderMind/scripts/run_batch.py
``_refresh_dep_graph_safe`` (codegen path) stops passing save_path. dep_graph rides inside rpg.json via the subsequent ``svc.save(rpg_path)``.

CoderMind/scripts/rpg_encoder/rpg_evolution.py
``_update_dep_graph_index``: legacy "may be stale" WARNING downgraded to INFO "embeds into rpg.json on save", the new default path.

CoderMind/scripts/rpg_encoder/run_update_rpg.py
Stops passing ``dep_graph_save_path`` to ``process_diff``.

CoderMind/scripts/rpg_edit/validate.py
Already preferred embedded; updated error text to mention ``/cmind.encode`` (which produces embedded) rather than the obsolete ``cmind script update_graphs.py sync`` standalone path.

CoderMind/scripts/rpg/models.py
Deprecation note on ``save_dep_graph`` docstring.

CoderMind/scripts/common/paths.py
Deprecation note on ``DEP_GRAPH_FILE`` constant.

Tests: CoderMind/tests/test_step4_integration.py
- ``test_update_dep_graph_index_writes_dep_graph_json`` → ``test_update_dep_graph_index_populates_in_memory_dep_graph`` (asserts in-memory + embedded round-trip via to_dict).
- ``test_update_dep_graph_index_save_path_outside_rpg_dir`` → ``test_update_dep_graph_index_legacy_save_path_still_writes_standalone`` (verifies the legacy compat path still works for callers that do pass save_path).
- ``test_update_dep_graph_index_without_save_path_logs_warning`` → ``test_update_dep_graph_index_without_save_path_logs_info`` (the WARNING→INFO downgrade: the new default path is no longer a degraded mode).
- ``test_process_diff_threads_dep_graph_save_path`` → ``test_process_diff_embeds_dep_graph_into_rpg``.
- ``test_run_update_rpg_advances_meta_git_and_runs_align``: assertion flipped from "dep_graph.json exists on disk" to "rpg.json has embedded ``dep_graph`` field".

Regression: 990 → 991 passed (the 1-test delta was the renamed ``test_update_dep_graph_index_without_save_path_logs_info`` taking the slot of the obsolete WARNING test). 29 pre-existing flakes unchanged.

Manual verification:
* Loaded all 5 real bench rpg.json artefacts (chalk / cobra / requests / sds / aho-corasick) via ``RPGService.load``; all reconstruct dep_graph from the embedded copy with the expected node/edge/mapping counts.
* Simulated new-encoder output by stripping ``_dep_graph_file`` and the standalone ``dep_graph.json`` from a tmpdir copy of cobra's rpg.json; ``RPGService.load`` still loaded 438 dep_nodes / 250 mappings from the embedded copy. Re-save preserved embed and cleared the legacy pointer field.
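
The flipped read order in ``RPGService.load`` (embedded copy first, legacy external file only as a logged fallback) can be illustrated with a standalone function. This is a sketch: `load_dep_graph` is a hypothetical helper, and resolving ``_dep_graph_file`` relative to the directory of rpg.json is an assumption.

```python
import json
import logging
import os
from typing import Any, Dict, Optional

logger = logging.getLogger("rpg.sketch")  # hypothetical logger name


def load_dep_graph(rpg_path: str) -> Optional[Dict[str, Any]]:
    """Prefer the dep_graph embedded in rpg.json; fall back to the legacy file."""
    with open(rpg_path, encoding="utf-8") as handle:
        rpg = json.load(handle)

    embedded = rpg.get("dep_graph")
    if embedded:
        return embedded

    # Backward-compat path for workspaces produced by older encoders.
    legacy_name = rpg.get("_dep_graph_file")
    if legacy_name:
        base = os.path.dirname(os.path.abspath(rpg_path))
        legacy_path = os.path.join(base, legacy_name)  # assumed relative path
        if os.path.exists(legacy_path):
            logger.info("dep_graph loaded from legacy file %s", legacy_path)
            with open(legacy_path, encoding="utf-8") as handle:
                return json.load(handle)
    return None
```

Because the embedded copy always wins when present, a stale standalone ``dep_graph.json`` left on disk can no longer override what rpg.json says.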

Replace migration-phase labels and archaeological comments with present-tense descriptions of the current behaviour. The cleanup covers runtime code, decoder-language tests, and workflow comments so source files no longer refer to implementation phases such as `Phase 3` or to past parser paths that only make sense when reading the original plan.

The remaining `previously` matches are prompt text that describes user state across LLM interactions, not source-code history. Those prompts need the temporal wording to preserve their semantics.

Tests:
CoderMind/scripts/decoder_lang/tests
Verify backend registry, language resolution, skeleton wiring, code-structure helpers, and prompt directives still behave as before.
CoderMind/tests/test_rpg_io.py
CoderMind/tests/test_step4_integration.py
CoderMind/tests/test_workflow_integration.py
Verify the adjacent encoder and RPG IO paths touched by the latest refactors still pass.

Regression: 184 passed / 60 subtests passed.

Implement Go code-unit discovery, import listing, syntax checks, environment detection, test commands, dependency install commands, and go test output parsing behind the existing LanguageBackend contract. Update decoder_lang tests to cover the Go backend behavior while keeping Python default package-marker behavior explicit.

Promote list leaves through a small normalizer so string leaves, named dict leaves, and single-key dict leaves can become branch nodes without using unhashable dicts as keys. Add regression coverage for dict leaves emitted by feature expansion.
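A minimal sketch of such a normalizer, assuming the three leaf shapes named above. The names ``leaf_name`` and ``promote_leaves`` are hypothetical and the real helper may keep more of the original leaf data.

```python
from typing import Any


def leaf_name(leaf: Any) -> str:
    """Return a hashable branch name for one list leaf.

    Handles plain strings, dicts that carry a ``name`` key, and
    single-key dicts (the key becomes the name).
    """
    if isinstance(leaf, str):
        return leaf
    if isinstance(leaf, dict):
        if isinstance(leaf.get("name"), str):
            return leaf["name"]
        if len(leaf) == 1:
            return str(next(iter(leaf)))
    raise TypeError(f"unsupported leaf shape: {leaf!r}")


def promote_leaves(leaves: list) -> dict:
    """Turn a list of leaves into empty branch nodes keyed by name."""
    # Using the extracted name as the key avoids hashing the dict itself.
    return {leaf_name(leaf): {} for leaf in leaves}
```

For example, `promote_leaves(["parse", {"name": "load"}, {"save": ["to disk"]}])` yields three branch nodes keyed `parse`, `load`, and `save`.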

Restore compatibility for direct encode checks, mocked edge statistics, dep-graph code metadata, initial encode progress parsing, and home-side log assertions. Update tests to match the current post-commit/post-merge dispatcher contract and canonical double-colon RPG paths.

Keep decoder_lang package documentation aligned with the implemented Go backend capabilities.

Add target_languages alongside target_language, infer missing language hints from requirement docs, and propagate the primary language through feature_build, feature_refactor, RPG construction, and build_skeleton. This keeps existing scalar consumers compatible while allowing multi-language specs to carry an ordered language list.
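One way to keep the scalar and list fields consistent is sketched below. The helper name ``resolve_languages``, the rule that an explicit ``target_language`` always leads the list, and the ``python`` default are assumptions made for illustration.

```python
from typing import List, Tuple


def resolve_languages(spec: dict) -> Tuple[str, List[str]]:
    """Return (primary_language, ordered_language_list) for a feature spec."""
    languages = list(spec.get("target_languages") or [])
    primary = spec.get("target_language")

    if primary:
        # Assumption: the scalar field wins and is moved to the front.
        languages = [primary] + [lang for lang in languages if lang != primary]
    if not languages:
        # Assumption: specs without any hint default to Python.
        languages = ["python"]

    return languages[0], languages
```

Existing scalar consumers can keep reading the first element, while multi-language consumers use the full ordered list.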

Move decoder language metadata into the feature artifact meta block so generated outputs have one canonical place for primary and target languages. Thread that metadata through feature construction, skeleton planning, data flow, base class design, interface design, RPG creation, and task planning. Add focused coverage for Go plan-stage behavior and tighten feature_construct checks so missing or inconsistent language metadata is reported early.

Register Rust and TypeScript decoder backends so planning stages use target-language parser, prompt, and validation behavior instead of falling back to Python. Add language-specific project tasks for Rust and TypeScript and make plan warnings fail the pipeline so partial interface coverage cannot be reported as a successful plan.

Track generated interface coverage in the interface orchestrator and return a failing status when files or skeleton features remain uncovered. Add backend-owned project task templates so future language support can provide dependency, entrypoint, and README prompts from the backend instead of requiring new plan_tasks branches.

Reuse already-complete interface subtrees when plan reruns from a warning state. This prevents design_interfaces from regenerating successful prefixes and lets strict plan resumes continue from the first incomplete subtree instead of repeating long LLM calls.

Print subtree restore and coverage progress during design_interfaces so long plan runs can be tracked from stdout without digging into trajectory files.

Remove Python-only pass/docstring wording from interface prompts and skip the Python-specific global interface review for non-Python targets. This keeps Rust and TypeScript interface design focused on target-language declaration stubs instead of Python repair behavior.

Remove the outer whole-subtree retry from interface generation. The inner per-subtree loop already retries missing files and features; repeating the entire subtree can turn small multilingual plan fixtures into long-running loops before surfacing incomplete coverage.

Accept TypeScript declare function/class/type/interface/enum declarations during interface validation. This prevents valid declaration-only interface snippets from being rejected as having no target-language declarations.
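As a rough illustration of what "declaration-only" snippets look like to a validator, the regex below recognizes the five `declare` forms listed above. It is a simplified stand-in, not the validator used in the PR (which other commits describe as parser-backed), and ``find_declarations`` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Matches e.g. "declare function load(...)" or "export declare class Store".
DECLARE_RE = re.compile(
    r"^\s*(?:export\s+)?declare\s+(?:abstract\s+)?"
    r"(function|class|type|interface|enum)\s+([A-Za-z_$][\w$]*)",
    re.MULTILINE,
)


def find_declarations(code: str) -> List[Tuple[str, str]]:
    """Return (kind, name) pairs for every ambient declaration in ``code``."""
    return [(m.group(1), m.group(2)) for m in DECLARE_RE.finditer(code)]
```

A snippet with at least one match would count as containing target-language declarations under this simplified rule.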

Use backend-safe docstring detection for Python interface validation and avoid injecting Python import conventions into non-Python interface prompts. Reconcile global review counts after orphan retain/prune decisions so retained Python entry points no longer leave stale orphan counts that make bench report WARN.

Normalize full Markdown code fences before interface syntax validation. This lets Go, Rust, TypeScript, and Python validators accept common LLM outputs where the code field contains a fenced source block.
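A minimal sketch of this kind of normalization, assuming the whole `code` field is a single fenced block. The name ``strip_code_fence`` is hypothetical; partially fenced or multi-block outputs are returned unchanged.

```python
import re

FENCE = "`" * 3  # a Markdown code fence marker

# The whole string must be one fenced block: opening fence with an optional
# language tag, the body, then the closing fence.
_FENCE_RE = re.compile(
    r"\A\s*" + FENCE + r"[^\n]*\n(.*?)\n?" + FENCE + r"\s*\Z",
    re.DOTALL,
)


def strip_code_fence(code: str) -> str:
    """Return the body of a fully fenced snippet, or ``code`` unchanged."""
    match = _FENCE_RE.match(code)
    return match.group(1) if match else code
```

Running this before syntax validation means the validator sees only source text, whichever language the fence was tagged with.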

Strip TypeScript comments before parser-backed syntax and code-unit extraction so valid declaration snippets with JSDoc text are not rejected. This keeps original interface code intact while making validation tolerant of common documentation comments.

Copilot

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 3 comments.

The final-test and smoke-test repair agents told the sub-agent to verify with a hardcoded pytest command, but final_test now runs the resolved backend's suite (Go/Rust/TS/JS as well as Python). Build the verify command from the backend so the repair agent uses the project's real test tool instead of pytest on a non-Python repo. Also fix inconsistent indentation in rpg_visualize.load_rpg.
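The idea of building the verify instruction from the resolved backend can be sketched as below. The ``Backend`` dataclass, the ``BACKENDS`` table, and the exact command flags are hypothetical stand-ins, not the `decoder_lang` API.

```python
import shlex
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Backend:
    """Hypothetical stand-in for a decoder language backend."""

    language: str
    test_command: Tuple[str, ...]


# Illustrative commands only; the real backends may use different flags.
BACKENDS = {
    "python": Backend("python", ("pytest", "-q")),
    "go": Backend("go", ("go", "test", "./...")),
    "rust": Backend("rust", ("cargo", "test")),
    "typescript": Backend("typescript", ("npm", "test")),
}


def verify_instruction(language: str) -> str:
    """Build the repair agent's verification hint from the backend."""
    backend = BACKENDS[language]
    return f"Verify the fix by running: {shlex.join(backend.test_command)}"
```

With this shape, a Go project gets `go test ./...` in its repair prompt and never sees a pytest command.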

Copilot

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 5 comments.

- smoke_test: locate the real entry via entry_point_candidates globs so Go's cmd/<name>/main.go is probed instead of silently skipped
- lang_parser: preserve Windows drive letters in path normalization
- final_validation: distinguish toolchain-unavailable from zero-collected test runs in the no-op guard
- skeleton_models: drop dead marker_default_body variable
- rpg/service: correct sync_from_file_list docstring (any language, not .py)

Copilot

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 3 comments.

- feature/spec: drop bare-word `\bgo\b` from language inference so English prose ("go to ...") no longer misdetects Go and picks the wrong backend; keep only unambiguous signals (golang/go.mod/go test|run|build/go language/go project)
- parse_prompts: use a real constructor name (__init__) in the PARSE_CLASS example instead of the synthetic `new_loader`
- pyproject: the tree-sitter grammars are mandatory deps; reword the comment so it no longer calls them optional
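The narrowed Go detection can be sketched with a single pattern built from the signals listed above. The names ``_GO_SIGNALS`` and ``mentions_go`` are hypothetical, and the actual inference code may weigh signals differently.

```python
import re

# Only unambiguous Go signals; the bare word "go" is deliberately absent.
_GO_SIGNALS = re.compile(
    r"\bgolang\b"
    r"|\bgo\.mod\b"
    r"|\bgo\s+(?:test|run|build)\b"
    r"|\bgo\s+(?:language|project)\b",
    re.IGNORECASE,
)


def mentions_go(text: str) -> bool:
    """Return True when ``text`` contains an unambiguous Go signal."""
    return bool(_GO_SIGNALS.search(text))
```

Under this rule "go to the settings page" no longer counts as a Go hint, while "run go test ./..." still does.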

Commit b8b9bef changed the example method key to "__init__" on review feedback, but that reintroduces a Python-specific constructor name into an encoder prompt that must stay language-neutral (it parses Go/Rust/C/C++/TS/JS too). "__init__" is on the forbidden-term list asserted by test_multilingual_prompt_safety. Use the real, language-neutral method name "configure" instead, which still satisfies the "use a real method name" review point without biasing the model toward Python.

Normalize dep-graph code-unit IDs through the canonical RPG path converter so non-Python nodes like `src/store.go:Load` can match `src/store.go::Load`. Run exact/canonical and suffix code-unit matching before falling back to the parent file node, preventing function/class/method dep nodes from being coarsely mapped to files when dedicated RPG nodes exist. Add regression coverage for Python, Go, and TypeScript code-unit mappings.
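The matching order described above (canonical, then suffix, then parent file) might look roughly like this. Both function names are hypothetical, and the rule that a colon followed by a path separator is not a unit separator is an assumption added so that drive-letter paths are left alone.

```python
from typing import Optional, Set


def canonical_unit_id(unit_id: str) -> str:
    """Convert ``path:Name`` to ``path::Name``; leave other IDs unchanged."""
    if "::" in unit_id:
        return unit_id
    head, sep, tail = unit_id.rpartition(":")
    # No colon, or the colon is followed by a path (e.g. "C:/repo/a.go").
    if not sep or not head or "/" in tail or "\\" in tail:
        return unit_id
    return f"{head}::{tail}"


def map_dep_node(dep_id: str, rpg_ids: Set[str]) -> Optional[str]:
    """Map a dep-graph node to the most specific matching RPG node."""
    canon = canonical_unit_id(dep_id)
    if canon in rpg_ids:  # 1. exact / canonical match
        return canon
    suffix_hits = sorted(r for r in rpg_ids if r.endswith("/" + canon))
    if suffix_hits:  # 2. suffix match at a path boundary
        return suffix_hits[0]
    file_id = canon.split("::", 1)[0]  # 3. fall back to the parent file
    return file_id if file_id in rpg_ids else None
```

The file fallback only fires when no dedicated function, class, or method node exists, which is the coarse mapping the commit aims to avoid.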

Update the initial encode progress parser to match the current encoder log message, `Total valid source files to parse`, instead of the old Python-only wording. Adjust progress parser tests and the mocked initial encode stderr to use the current source-file message.
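A small sketch of a parser keyed on the current message. Only the phrase `Total valid source files to parse` comes from the commit; the assumption that the count follows the phrase on the same line, and the name ``parse_total_files``, are illustrative.

```python
import re
from typing import Optional

# Assumption: the file count appears after the phrase on the same line.
_TOTAL_RE = re.compile(r"Total valid source files to parse[^\d\n]*(\d+)")


def parse_total_files(line: str) -> Optional[int]:
    """Extract the total file count from one encoder log line, if present."""
    match = _TOTAL_RE.search(line)
    return int(match.group(1)) if match else None
```

A line using the old Python-only wording does not match, which is why the mocked stderr in the tests had to change as well.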

Update the C++ backend's CMake test command to run `ctest` with `--test-dir <repo>/build`, matching the existing out-of-source `cmake -S <repo> -B <repo>/build` reconfigure step. Also update the C++ fallback test command shown in codegen prompts so agents use the same CTest invocation. Add regression coverage for CMake-based C++ test command construction.
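The pairing of the configure step and the test step can be sketched as follows. ``cmake_commands`` is a hypothetical helper; the `cmake -S/-B` and `ctest --test-dir` flags are the ones quoted above (`--test-dir` requires CMake 3.20 or newer).

```python
from pathlib import Path
from typing import List


def cmake_commands(repo: Path) -> List[List[str]]:
    """Return the configure and test commands for an out-of-source build."""
    build_dir = repo / "build"
    return [
        # Configure into <repo>/build, matching the reconfigure step.
        ["cmake", "-S", str(repo), "-B", str(build_dir)],
        # Run CTest against the same build directory.
        ["ctest", "--test-dir", str(build_dir)],
    ]
```

Pointing both commands at the same directory keeps `ctest` from running in the source tree, where no test metadata has been generated.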

Extend related test discovery beyond Python using conservative conventions for Go, Rust, JavaScript/TypeScript, C, and C++ test files. Keep Python's existing scoped pytest behavior, but continue running project-level verification for non-Python backends so file paths are not misused as backend-specific test selectors. Update post-verification logging to report related test hints separately from the actual project test command, and add regression coverage for multi-language discovery plus non-Python full-suite safety.
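To illustrate what conservative naming conventions could look like, here is a sketch for three of the languages. The specific patterns (`test_x.py`, `x_test.go`, `x.test.ts`, `x.spec.ts`) are common community conventions assumed for this example, not necessarily the rules in the PR; Rust, C, and C++ are omitted.

```python
from pathlib import PurePosixPath
from typing import List


def related_test_candidates(source: str) -> List[str]:
    """Return sibling test-file paths that conventionally cover ``source``."""
    path = PurePosixPath(source)
    stem, suffix = path.stem, path.suffix

    if suffix == ".py":
        names = [f"test_{stem}.py", f"{stem}_test.py"]
    elif suffix == ".go":
        names = [f"{stem}_test.go"]
    elif suffix in {".ts", ".tsx", ".js", ".jsx"}:
        names = [f"{stem}.test{suffix}", f"{stem}.spec{suffix}"]
    else:
        names = []  # unknown language: no hints rather than wrong hints

    return [str(path.with_name(name)) for name in names]
```

As the commit notes, such paths are only hints for non-Python backends; the project-level test command still runs in full.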

Use lang_parser's language-specific test-file rules for supported source files during dependency-graph build filtering, while preserving the legacy test-ish filter for non-source files. This prevents valid non-Python sources such as Go testdata fixtures or TypeScript test helper modules from being excluded before parse-time filtering can see them. Add regression coverage for Go testdata, non-Python helper source files, language-specific test files, and legacy non-source test-ish exclusions.

Allow codegen backend resolution to use on-disk source scanning as a fallback when feature_spec and RPG metadata do not declare a target language. Pass the repo path through prompt-building call sites so API summaries and TDD prompts resolve the same backend, while preserving explicit feature_spec/RPG language metadata precedence. Add regression coverage for metadata-free Go repos and explicit metadata overriding source-scan fallback.

Add backend-owned file dependency edge resolution for supported languages and refactor plan_tasks.py to sort files per inferred backend instead of using a global primary-language import heuristic. Preserve cross-language ordering conservatively for mixed-language subtrees. Move non-Python project task guidance fully into backends, update Go entry-point requirements, and clarify that warning states are cross-stage contract violations that trigger rebuilds.

- Exclude skip directories themselves from dep graph scans
- Defer non-Python semantic edges during add_file
- Preserve Go single-line function body invokes
- Resolve TypeScript namespace import member calls

- Treat feature-free subtrees as complete when saved file blocks exist
- Improve non-Python signature summaries with name-only and declaration fallback
- Propagate target language through FuncDesigner interface phase
- Add targeted tests for resume, summaries, and language propagation

…overage (#82)

## Summary

Follow-up to #67. This PR hardens the multilingual decoder/codegen pipeline by tightening interface completeness checks, generated-artifact hygiene, C/C++ verification semantics, and planner prompt safety. It also restores the extracted multilingual regression tests that were moved out of the original #67 branch for a follow-up PR.

## What changed

### Decoder and verification hardening

- Build C/C++ CMake targets before running `ctest`, so verification does not falsely pass or fail because test executables were never built.
- Treat C/C++ `make test` targets that only compile objects, without actually running tests, as verification errors instead of successful test runs.
- Skip generated/build/cache directories when collecting C/C++ source files for syntax and verification commands.
- Improve C/C++ prompt rules to:
  - avoid editing build/cache/generated artifacts,
  - use full CMake build + CTest commands,
  - avoid relying on undeclared/transitively included helper functions,
  - report explicit syntax-check summaries.

### Generated artifact hygiene

- Add a shared generated-artifact classifier and prompt rule helper.
- Install local `.git/info/exclude` hygiene rules during batch startup.
- Reject persisted generated artifacts before post-verify and after verification runs.
- Prevent batch branches containing generated artifacts from being merged.

### Interface and planner robustness

- Deduplicate repeated whole-file `file_code` blocks in `interfaces.json` before serialization and before planner prompt construction.
- Add interface coverage validation so `plan_tasks` fails fast when `interfaces.json` does not cover all skeleton features.
- Improve Python dependency collection for same-file calls and `self.method()` invocations.
- Save in-progress interface generation to `interfaces.json.partial` and only overwrite the canonical `interfaces.json` after successful completion.
- Allow interface review additions to scaffold missing file entries under existing feature subtrees.

### Parser and language detection fixes

- Classify header-heavy mixed C/C++ repositories as C++ when C and C++ votes appear together.
- Harden fallback string-literal stripping against catastrophic regex backtracking on unterminated escaped strings.

### Final validation behavior

- Propagate smoke-test failures into the final validation result instead of allowing a successful unit-test result to mask a failed smoke check.
- Clarify that `plan --check-only` warning states are not complete/done states and should not allow downstream stages to proceed.

## Test coverage

This PR adds extensive regression coverage for the multilingual pipeline, including:

- generated artifact hygiene,
- interface source deduplication,
- skeleton/interface coverage validation,
- multilingual dependency graph behavior,
- multilingual encoder/codegen behavior,
- planner language support and prompt deduplication,
- C/C++/Go/Rust/TypeScript/JavaScript/Python parser behavior,
- decoder language backends and planning phases,
- zero-test guard behavior,
- final test repair,
- repo language resolution,
- orphan/test/build exclusion handling,
- smoke multilingual coverage.

The diff adds 33 new test files and restores the extracted multilingual tests intended for this follow-up PR.

## Notes for reviewers

- This branch was rebased on top of the squash-merged #67 commit, so the PR diff should now represent only the follow-up hardening and restored tests.
- `run_batch` / `post_verify` now update `.git/info/exclude` with local generated-artifact exclusions. This is intentionally local-only and non-destructive.
- `plan_tasks` now fails on incomplete interface coverage instead of silently planning from stale or partial interfaces.
- C/C++ projects whose `make test` target only compiles objects but does not execute tests will now be rejected as invalid verification results.
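The partial-file behaviour listed under "Interface and planner robustness" can be sketched as below. The function names are hypothetical and the real code may differ; only the file names `interfaces.json.partial` and `interfaces.json` come from the description.

```python
import json
import os
from pathlib import Path


def save_partial(out_dir: Path, interfaces: dict) -> Path:
    """Write in-progress interfaces next to, not over, the canonical file."""
    partial = out_dir / "interfaces.json.partial"
    partial.write_text(json.dumps(interfaces, indent=2), encoding="utf-8")
    return partial


def finalize(out_dir: Path) -> Path:
    """Promote the partial file once generation has fully succeeded."""
    partial = out_dir / "interfaces.json.partial"
    final = out_dir / "interfaces.json"
    # os.replace overwrites the destination in a single rename step.
    os.replace(partial, final)
    return final
```

Because `finalize` is only called on success, an interrupted run leaves the previous `interfaces.json` untouched for downstream stages.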

HuYaSen added 30 commits June 4, 2026 13:20

docs(decoder): Update Go backend package description

758ac74


chore(decoder): Print interface generation progress

5fe3b2f


fix(decoder): Recognize TypeScript declaration stubs

5632740


fix(decoder): Strip fenced interface snippets

e5f2648


Copilot AI reviewed Jun 15, 2026


Comment thread CoderMind/scripts/rpg_visualize.py Outdated

Comment thread CoderMind/scripts/code_gen/final_validation.py Outdated

Comment thread CoderMind/scripts/lang_parser/config/cpp.py

HuYaSen requested a review from Copilot June 15, 2026 06:45

Copilot started reviewing on behalf of HuYaSen June 15, 2026 06:45

Copilot AI reviewed Jun 15, 2026


Comment thread CoderMind/scripts/smoke_test.py

Comment thread CoderMind/scripts/lang_parser/registry.py Outdated

Comment thread CoderMind/scripts/skeleton/skeleton_models.py Outdated

Comment thread CoderMind/scripts/rpg/service.py

Comment thread CoderMind/scripts/code_gen/final_validation.py

HuYaSen requested a review from Copilot June 15, 2026 07:08

Copilot started reviewing on behalf of HuYaSen June 15, 2026 07:09

Copilot AI reviewed Jun 15, 2026


Comment thread CoderMind/scripts/feature/spec.py

Comment thread CoderMind/scripts/rpg_encoder/prompts/parse_prompts.py

Comment thread CoderMind/pyproject.toml Outdated

HuYaSen and others added 12 commits June 15, 2026 15:28

[codermind] Clean non-header imports.

22e0e18

[cmind] Update cmind init hint expression.

5f1f478

fix(codermind): surface non-python interface review follow-ups

40b9fd0

fix(plan): sync interface backfill into RPG updates

2201b16

QingtaoLi1 self-requested a review June 25, 2026 01:49

QingtaoLi1 added 3 commits June 25, 2026 08:08

fix(dep-graph): patch multilingual dependency edge gaps

e3312d7


QingtaoLi1 approved these changes Jun 25, 2026


QingtaoLi1 merged commit 2265cf2 into main Jun 25, 2026
2 checks passed

This was referenced Jun 25, 2026

Fix multilingual pipeline bugs and restore extracted tests #81

Closed

Harden multilingual decoder verification and restore follow-up test coverage #82

Merged

feat: multi-language support for the encoder and decoder pipelines #67
QingtaoLi1 merged 82 commits into main from dev/decoder-multilang-pipeline

HuYaSen commented Jun 15, 2026

3 participants
