feat: multi-language support for the encoder and decoder pipelines #67

Merged
QingtaoLi1 merged 82 commits into main from dev/decoder-multilang-pipeline
Jun 25, 2026

HuYaSen commented Jun 15, 2026

Collaborator

Summary

This PR extends CoderMind's encoder (repository → RPG graph) and decoder (docs → repository) pipelines from Python-only to multi-language, adding support for Python, JavaScript, TypeScript, Go, C, C++, and Rust.

All language differences are isolated behind two abstraction layers:

  • lang_parser — a tree-sitter–based parser module giving the encoder uniform code-structure and dependency extraction.
  • decoder_lang backends — one backend per language encapsulating file layout, entry points, test commands, code structure, and build environment; every decoder stage (plan / codegen / verification) goes through these backends.

Encoder

  • New standalone lang_parser module (tree-sitter backend + 7 language parsers/configs) wired into the RPG / dep-graph / encoder pipeline.
  • Language metadata flows through refactor_tree NodeMetaData; rpg.json's meta.language reflects the real repo instead of always python.
  • Dominant-language detection; add_file and semantic re-runs dispatch per language; non-Python class-like units normalized.
  • rpg.json single source of truth (dep_graph embedded); atomic writes eliminate half-written files.
  • C/C++ calls resolve to the definition, not the header prototype; scan exclusions centralized and per-commit LLM exclusion dropped (perf).

Decoder

  • Seven language backends with a unified protocol: entry points, test commands, inheritance, code structure, build env, unit-kind.
  • Planning: language metadata flows through feature spec → build → tree → skeleton → interfaces → tasks; interface design/review works for every language with language-neutral prompts; non-Python files ordered by real imports, not Python AST.
  • Code generation: codegen / post-verify / global-review / subtree-review / final-test all route through the backend and use native test commands (go test, npm, cargo, ctest, …); non-Python projects no longer emit Python files or run venv/pip setup.
  • Verification: on-disk source fallback for language resolution; zero-test "no-op pass" rejected; bounded repair loop in final-test; unified git branch-name sanitization.
  • Entry coordination: <MAIN_ENTRY> and the skeleton entry are reconciled across all 7 languages to avoid a duplicate main.
  • Interface review (final fix): unit-level and feature-level orphan checks unified into a single shared predicate, removing spurious WARNs on framework roots/factories/callbacks; entry-point name matching normalized (RunMain vs. function RunMain).
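
The verification rules above (reject a zero-test "no-op pass", keep the repair loop bounded) can be sketched roughly as follows. This is only an illustration: `RunSummary`, `is_real_pass`, and `run_with_repairs` are hypothetical names, and the default attempt limit is an assumption rather than the value the PR uses.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RunSummary:
    """Hypothetical, simplified view of one native test run."""
    exit_code: int
    tests_run: int
    failures: int


def is_real_pass(result: RunSummary) -> bool:
    # A clean exit with zero executed tests is a "no-op pass" and is rejected.
    return result.exit_code == 0 and result.failures == 0 and result.tests_run > 0


def run_with_repairs(
    run_tests: Callable[[], RunSummary],
    repair: Callable[[RunSummary], None],
    max_attempts: int = 3,  # assumed bound, for illustration only
) -> Tuple[bool, int]:
    """Run tests, repairing between attempts, but never more than max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        result = run_tests()
        if is_real_pass(result):
            return True, attempt
        if attempt < max_attempts:
            repair(result)
    return False, max_attempts
```

The point of the `tests_run > 0` clause is that commands such as `go test` or `cargo test` can exit successfully when no test was discovered, which would otherwise look like a pass.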

HuYaSen added 30 commits June 4, 2026 13:20
Adds a self-contained source-parsing module supporting Python, Go,
TypeScript, JavaScript, C, C++, and Rust. The module is structurally
isolated — no other code imports it yet; this commit only brings the
module + its unit tests in. A follow-up commit will wire it into
scripts/rpg/* and scripts/rpg_encoder/* so the encoder actually parses
non-Python sources.

Layout
- scripts/lang_parser/
  - base.py          BaseLanguageParser ABC
  - models.py        LanguageConfig / LPCodeUnit / LPDependency / LPFileResult
  - registry.py      detect_language / parse_file / validate_syntax /
                     is_supported_source / is_test_file /
                     get_parser{,_for_file} / get_config{,_for_path} /
                     markdown_fence_for_path
  - tree_sitter_backend.py  Lazy wrapper around optional tree-sitter
                            grammar packages (missing grammar only
                            disables one language, never crashes import)
  - config/          One per language: extensions / test globs /
                     tree-sitter language name / node-type vocab
  - <lang>_parser.py One per language; _c_family_parser.py and
                     _ecmascript_parser.py hold shared logic
  - extractors/      Fallback unit extraction
- tests/test_lang_parser_*.py  66 tests covering registry + each parser

API contract
- Public surface is locked via lang_parser/__init__.py's __all__. Callers
  must import from the top level (e.g. 'from lang_parser import
  parse_file'); reaching into lang_parser.python_parser etc. is reserved
  for tests.

Dependencies (added to pyproject.toml)
- tree-sitter-go / tree-sitter-typescript / tree-sitter-javascript /
  tree-sitter-c / tree-sitter-cpp / tree-sitter-rust. All grammars are
  loaded lazily through tree_sitter_backend; a user who only wants
  Python support can uninstall any subset and the rest of the module
  keeps working.

Verification
- 'pytest tests/test_lang_parser_*.py' -> 66 passed.
- Module import smoke-test confirms all 15 public symbols load.

Builds on the standalone lang_parser module landed in the previous
commit. The Python path is preserved unchanged everywhere; non-Python
sources (Go, TypeScript / JavaScript, C / C++, Rust) now flow through
parallel lang_parser branches and produce the same RPG node + edge
shape downstream consumers already expect.

rpg/models.py
- NodeMetaData gains a 'language' field (Optional[str]) propagated by
  to_dict / from_dict. Legacy artefacts that omit the field load as
  language=None.
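
For illustration, the backward-compatible field could look like the sketch below. `NodeMetaDataSketch` is a stand-in with a single hypothetical `path` field; the real NodeMetaData carries more state.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class NodeMetaDataSketch:
    """Simplified stand-in for NodeMetaData (fields other than language are assumed)."""
    path: str
    language: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        return {"path": self.path, "language": self.language}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "NodeMetaDataSketch":
        # dict.get() lets artefacts written before the field existed load as None.
        return cls(path=data["path"], language=data.get("language"))
```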

rpg/dep_graph.py
- import lang_parser at module level.
- _exclude_irrelevant_for_parse now accepts any language registered with
  lang_parser and uses lang_parser.is_test_file for per-language test
  detection (replaces hardcoded '.py' + test_/_test.py heuristics).
- DependencyGraph.__init__ adds a lazy Go module-path cache.
- New class constants: _LP_CLASS_LIKE_UNIT_TYPES, _LP_NODE_TYPES,
  _TS_JS_IMPORT_EXTENSIONS, _C_FAMILY_EXTENSIONS, _RUST_EXTENSIONS.
- parse() dispatches on lang_parser.detect_language: Python keeps the
  original ast path (unchanged), non-Python routes through
  _parse_lp_file_result + _parse_lp_invoke_dependencies.
- ~27 new methods adapt LPFileResult into RPG nodes/edges and resolve
  cross-language dependencies: _parse_lp_file_result,
  _parse_lp_invoke_dependencies, _parse_lp_dependencies,
  _lp_* (node id / attrs / type helpers), _add_lp_* (edge helpers),
  _resolve_lp_* (cross-file import + invoke resolution),
  _resolve_{ecmascript,go,c,rust}_invoke, _is_c_family_import,
  _is_rust_import, _resolve_rust_* (mod/super/crate/path helpers),
  _resolve_go_module_import + _read_go_module_path, plus the import
  placeholder helpers (_ensure_lp_import_placeholder, ...).
- Python file nodes are now also tagged with language='python' so
  downstream tooling can branch uniformly on the language attribute.

rpg/code_unit.py
- ParsedFile.__init__ tries lang_parser first (_parse_with_language_parser)
  and falls back to the original ast path. Python stays on the ast path
  for behavioural parity. _code_units_from_parser_result adapts
  LPFileResult units into CodeUnit, preserving language / line range in
  extra.
- CodeUnit.lineno / end_lineno fall back to extra['line_start' /
  'line_end'] when self.node is not an ast.AST (i.e. lang_parser units).
- CodeSnippetBuilder.generate_code_snippet skips the ast.parse import /
  assignment scan when the language is not Python (a new
  _language_for_units helper detects via unit.extra['language'] or
  file_path).
- CodeSnippetBuilder.build now emits the correct markdown fence per
  file via lang_parser.markdown_fence_for_path (ts -> typescript, etc).
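
The line-number fallback for non-ast units might be sketched as follows. `CodeUnitSketch` is a hypothetical reduction of CodeUnit that keeps only `node` and `extra`.

```python
import ast
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CodeUnitSketch:
    """Hypothetical reduction of CodeUnit: only the two attributes the fallback needs."""
    node: Any = None
    extra: Dict[str, Any] = field(default_factory=dict)

    @property
    def lineno(self) -> Optional[int]:
        if isinstance(self.node, ast.AST):
            return getattr(self.node, "lineno", None)
        # lang_parser units carry their range in extra instead of an ast node.
        return self.extra.get("line_start")

    @property
    def end_lineno(self) -> Optional[int]:
        if isinstance(self.node, ast.AST):
            return getattr(self.node, "end_lineno", None)
        return self.extra.get("line_end")
```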

rpg_encoder/rpg_evolution.py
- _filter_non_test_py_files keeps its historical name but accepts any
  lang_parser-supported source via is_supported_source +
  is_test_file. Diff-driven incremental update paths (added / deleted /
  modified file filters) use is_supported_source too.

rpg_encoder/rpg_encoding.py
- skeleton's valid_files filter uses is_supported_source AND
  not is_test_file.

rpg_encoder/semantic_parsing.py
- file enumeration uses is_supported_source AND not is_test_file.

tests
- tests/test_multilingual_code_unit.py:   4/4 pass
- tests/test_multilingual_dep_graph.py:   5/6 pass
- tests/test_multilingual_encoder_pipeline.py: 2/4 pass
- tests/test_multilingual_prompt_safety.py: 0/3 pass

Known failures (intentionally deferred to a follow-up PR):
- test_multilingual_prompt_safety: requires editing scripts/rpg_encoder
  prompt strings to remove Python-only wording ('Python classes',
  '.py files', etc). Pure prose work, orthogonal to this change.
- test_multilingual_encoder_pipeline::test_refactor_tree_assigns_*:
  requires propagating language metadata through scripts/rpg_encoder/
  refactor_tree.py — the file has independently evolved on main and
  needs a separate audit to avoid clobbering its current behaviour.
- test_multilingual_encoder_pipeline::test_go_repo_enters_semantic_*:
  exercises a deeper semantic_parsing code path (group_units +
  per-language prompts) that this PR does not touch.
- test_multilingual_dep_graph::test_incremental_update_keeps_typescript_*:
  exercises the incremental-update edge-replay path which is
  orthogonal to the first-pass parse done here.

Verification (rpgkit conda env on this checkout)
- pytest tests/test_lang_parser_*.py                  -> 66 / 66 pass
- pytest tests/test_rpg_models.py tests/test_rpg_encoding.py
         tests/test_e2e.py tests/test_integration.py  -> no new failures
  (1 pre-existing failure on main; verified by git-stash bisection)

Self-review of the previous wire-in commit caught three places where
the multi-language path was incomplete or inconsistent. None of these
affect the Python-only happy path, but each silently degrades the
multi-language story.

dep_graph.py: reparse_ast() must mirror parse() dispatch
  Previously, after a from_dict round-trip, reparse_ast unconditionally
  ran ast.parse on every file node — which raises SyntaxError on Go /
  TS / C / Rust / ... and silently skipped them, leaving non-Python
  files with no 'ast' attribute and therefore no rebuilt import /
  invoke / inherit edges. Now it dispatches on lang_parser.detect_language
  exactly like parse() does and routes non-Python through
  _parse_lp_file_result + _parse_lp_invoke_dependencies.

rpg_evolution.py: incremental-diff filters now also exclude tests
  The added / deleted / modified filters were using is_supported_source
  but had lost the 'and not is_supported_test_file' check during the
  wire-in. Restored parity with the old repo so the encoder doesn't
  treat newly added test files (e.g. main_test.go) as production code
  on incremental update.

models.py: infer_type_name_from_path now recognises non-Python sources
  The fallback inference returned 'directory' for any path that didn't
  end in '.py', which would mislabel .go / .ts / .rs files as
  directories when metadata-driven NodeType inference is exercised.
  Delegates to lang_parser.is_supported_source with a graceful
  ImportError fallback so the legacy Python-only behaviour is
  preserved when lang_parser is not installed.

Verification
- syntax check on all three files: OK
- pytest tests/test_lang_parser_*.py + tests/test_multilingual_*.py +
  tests/test_rpg_models.py + tests/test_e2e.py + tests/test_integration.py
  -> identical pass/fail to before this commit (no regression). The 6
  deferred multilingual failures and 1 pre-existing main failure
  remain, unchanged.
…n ParsedFile

semantic_parsing.py groups units by unit_type == 'class' / 'function'
to drive its parse_classes / parse_functions LLM batches. Until now,
lang_parser-produced kinds like Go 'struct', Rust 'enum'/'trait',
TS/Go 'interface', C/C++ 'struct' were forwarded verbatim — so they
fell into neither bucket and were silently dropped from the LLM input.

Normalise these to 'class' inside _code_units_from_parser_result, keep
the original kind in extra['lp_kind'] so RPG-side renderers can still
recover it, and update the parity test to lock in the new contract.
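
A small sketch of that normalisation, assuming the kind set is limited to the names listed above. `normalise_unit_type` is a hypothetical helper, not the body of _code_units_from_parser_result.

```python
from typing import Any, Dict, Tuple

# Kinds named in this commit message; the real set may be larger.
CLASS_LIKE_KINDS = frozenset({"struct", "enum", "trait", "interface"})


def normalise_unit_type(kind: str, extra: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Map class-like kinds to 'class' and keep the original kind in extra['lp_kind']."""
    if kind in CLASS_LIKE_KINDS:
        return "class", {**extra, "lp_kind": kind}
    return kind, dict(extra)
```

With this shape, grouping by `unit_type == 'class'` picks up Go structs and Rust traits, while a renderer that cares about the distinction can still read `extra['lp_kind']`.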

Verified end-to-end via tests/test_multilingual_encoder_pipeline.py::
test_go_repo_enters_semantic_parsing_with_non_empty_units (previously
KeyError, now passes).
…MetaData

RefactorTree now accepts a repo-wide 'language' default plus an
optional 'language_map' (path-prefix -> language) and stamps each
NodeMetaData it produces (FILE / CLASS / FUNCTION / METHOD) with the
resolved language so downstream tooling can branch on
node.meta.language instead of guessing from extension.

- __init__: add 'language' (default 'python') and 'language_map'
  kwargs; normalise prefixes via _normalise_lang_prefix.
- _resolve_language(path): longest-prefix match against language_map,
  falls back to self.language; returns the default for None/empty.
- 8 NodeMetaData sites in run() (instance path) and
  refactor_new_files() (classmethod path) now pass
  language=self._resolve_language(file_path) /
  instance._resolve_language(file_path).
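
The longest-prefix lookup can be sketched as a free function. Matching on whole path segments is an assumption made here, and the function name and signature are hypothetical.

```python
from typing import Dict, Optional


def resolve_language(
    path: Optional[str],
    language_map: Dict[str, str],
    default: Optional[str],
) -> Optional[str]:
    """Return the language of the longest matching path prefix, else the default."""
    if not path:
        return default
    best_prefix = ""
    best_language = default
    for prefix, language in language_map.items():
        norm = prefix.strip("/")
        # Assumed rule: a prefix matches only on whole path segments.
        matches = path == norm or path.startswith(norm + "/")
        if norm and matches and len(norm) > len(best_prefix):
            best_prefix, best_language = norm, language
    return best_language
```

For example, with `{"web": "typescript", "web/native": "rust"}` a file under `web/native/` resolves to Rust because the longer prefix wins, while anything outside `web/` falls back to the repo-wide default.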

Verified by tests/test_multilingual_encoder_pipeline.py::
test_refactor_tree_assigns_language_metadata_to_go_and_typescript_nodes
which now passes; previously asserted None for every type.
…mpts

The parse/encoding prompts predate the multi-language wire-in and
talked exclusively about Python: 'Python classes', 'Python repository',
'__init__/__new__/__repr__' as canonical examples, '.py only' scope
clauses, and Python-stack typing examples ('pandas.DataFrame',
'pyarrow.Table'). For Go / TS / C / Rust repos these phrasings bias
the LLM toward Python idioms and at worst tell it to ignore non-Python
files entirely.

- parse_prompts.PARSE_CLASS: 'classes' -> 'class-like constructs
  (classes, structs, interfaces, traits, enums, ...)'; drop
  __init__/__new__/__repr__ examples; rewrite DataLoader example to
  use a language-neutral 'new_loader' method name.
- parse_prompts.PARSE_FUNCTION: 'standalone Python functions' ->
  'standalone (module-level) functions across any supported language'.
- encoding_prompts.EXCLUDE_FILES: drop the '.py only' scope block;
  switch to 'source files in supported languages'.
- encoding_prompts.ANALYZE_DATA_FLOW: 'Python repository' ->
  'source repository'; replace pandas/pyarrow type examples with
  generic 'UserRecord' / 'User'.
- tests/test_multilingual_prompt_safety.py: update the schema
  assertions to match the richer {feature: description} payload the
  prompts actually emit (the legacy [feature1, feature2] array
  shape was already gone before this scrub).

Two related holes in the incremental dep_graph update path that
silently dropped all non-Python semantic edges:

1. add_file() always ran ast.parse on the file content. For Go / TS /
   C / Rust this raised SyntaxError, was logged at DEBUG, and the
   file node was kept with NO units, NO ast attr, and NO language
   attr. update_files() therefore had nothing to feed into the
   downstream semantic passes for any modified non-Python file.

2. _rerun_semantic_passes() only walked nodes with an ast attr. Even
   after fix (1) populates a language attr, the import/inherit/invoke
   passes were ast.Module-specific and produced zero edges for the
   newly-parsed lang_parser files.

Fixes:

- add_file(): detect_language(nid) first; Python keeps the original
  ast path, anything else routes through lang_parser.parse_file +
  _parse_lp_file_result (matching :meth:). Invoke resolution
  is deliberately deferred to _rerun_semantic_passes so cross-file
  calls see the final unit registry.
- _rerun_semantic_passes(): after the ast-based passes, walk every
  FILE node whose language is non-None and non-Python, re-run
  lang_parser.parse_file using the cached 'code' attr, replay
  _parse_lp_file_result, then run _parse_lp_invoke_dependencies in a
  second pass so cross-file invoke targets resolve.

Verified by tests/test_multilingual_dep_graph.py::
test_incremental_update_keeps_typescript_import_edges which now passes;
previously the post-update graph had no app.ts:run node and no
imports edge.
…ects the repo, not always 'python'

Found via codermind-bench: every multi-language repo (Go/Rust/TS/...) had rpg.json feature nodes stamped with meta.language='python' even though dep_graph.json correctly reported the real language. Root cause was RefactorTree's language default ('python') silently winning because RPGParser never passed one.

Fix:

  - rpg_encoding.RPGParser now derives the dominant language from self.valid_files via a new module-level helper _dominant_language() (uses lang_parser.registry.detect_language; ignores files with unknown/unsupported extensions; returns None only if every file is unknown). The result is passed to RefactorTree as language=. The log line surfaces the choice.

  - refactor_tree.RefactorTree: language kwarg now Optional[str]=None (was 'python'), self.language typed Optional[str], _resolve_language() returns Optional[str]. Callers that never set a language now get meta.language=None instead of a fabricated value.

Verified by re-running cobra (Go): rpg.json now shows {'go': 207} where it previously showed {'python': 205}. All 96 lang_parser / multilingual tests still pass; full suite shows identical 30 pre-existing flakes as origin/main, +96 new passes, 0 new failures.
…factor classmethods

Audit follow-up to 5f9ddf6. RefactorTree.refactor_new_files() and refactor_modified_files() (used by the post-merge / incremental update path in rpg_evolution) also instantiated 'cls(...)' without language=, so files added or touched after the initial encode would land with meta.language=None even though the same repo's initial-encode rpg.json had it correct.

- lang_parser.registry: extract _dominant_language() into a public dominant_language(paths) helper (accepts any iterable; deterministic tie-break). Re-export via lang_parser package.

- rpg_encoding: switch RPGParser.parse_rpg_from_repo() to the shared helper (drop the local _dominant_language).

- refactor_tree: refactor_new_files() and refactor_modified_files() now compute the dominant language from parsed_tree.keys() and pass it to cls(language=...). Log line surfaces the choice when known.
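
A rough sketch of what a dominant-language helper with a deterministic tie-break could look like. The extension table is a hypothetical stand-in for the registry, and the alphabetical tie-break is an assumption; the commit only states that the tie-break is deterministic.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, Optional

# Hypothetical extension table; the real registry derives this from per-language configs.
_EXT_TO_LANG = {
    ".py": "python", ".go": "go", ".ts": "typescript", ".js": "javascript",
    ".rs": "rust", ".c": "c", ".cpp": "cpp",
}


def detect_language(path: str) -> Optional[str]:
    return _EXT_TO_LANG.get(PurePosixPath(path).suffix.lower())


def dominant_language(paths: Iterable[str]) -> Optional[str]:
    """Most frequent language among paths; None only if every file is unknown."""
    counts = Counter(
        lang for lang in (detect_language(p) for p in paths) if lang is not None
    )
    if not counts:
        return None
    # Highest count wins; ties break alphabetically (assumed rule) for determinism.
    return min(counts, key=lambda lang: (-counts[lang], lang))
```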

Net change vs upstream: 96 lang_parser + multilingual tests pass; full suite shows identical 30 pre-existing flakes as origin/main with +96 net passes.

Introduce a LanguageBackend strategy interface that lets the decoder
pipeline (skeleton / func_design / code_gen) treat the target
programming language as a parameter rather than the historical
hard-coded .py / stdlib ast / pytest assumptions.

Behavioural invariants:
* All existing Python pipelines are byte-equivalent -- full decoder
  test suite holds at 1055 passed / 30 pre-existing flakes for every
  intermediate phase as well as the final state.
* Encoder bench on cobra (Go) passes end-to-end; new package imports
  cleanly in subprocess env without polluting encoder behaviour.

Phase 0 -- abstraction layer (scripts/decoder_lang/)
  backend.py        : LanguageBackend Protocol + registry + ToolchainUnavailable
  prompt_hints.py   : PromptHints dataclass
  test_result.py    : TestRunResult / TestFailure / EnvHandle dataclasses
  python_backend.py : Behaviour-preserving Python implementation
  Trial wiring: code_gen/static_checks.py replaces a single
  ``suffix == '.py'`` literal with ``backend.is_source_file(path)``,
  proving the abstraction works without changing behaviour.

Phase 1 -- target_language propagation
  feature/schemas/spec.py: optional ``target_language`` field on
  FeatureSpecOutput (defaults to None; legacy artefacts load
  unchanged).
  decoder_lang/backend.py: ``resolve_decoder_language`` with a
  4-tier fallback chain (feature_spec -> RPG root meta ->
  dominant_language -> python with WARNING).
  skeleton/file_designer.py: FileDesigner.__init__ accepts
  ``target_language``, stores resolved backend on ``self.backend``.
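
The 4-tier fallback chain can be sketched as below. The flat three-argument signature is an assumption for illustration; the real ``resolve_decoder_language`` presumably reads these values from the feature spec and the RPG itself.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_decoder_language(
    feature_spec_language: Optional[str],
    rpg_root_language: Optional[str],
    dominant: Optional[str],
) -> str:
    """feature_spec -> RPG root meta -> dominant_language -> python (with WARNING)."""
    for candidate in (feature_spec_language, rpg_root_language, dominant):
        if candidate:
            return candidate
    logger.warning("No target language could be resolved; falling back to python")
    return "python"
```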

Phase 2 -- skeleton multi-language + GoBackend skeleton subset
  decoder_lang/go_backend.py: skeleton-relevant subset (file
  extension, identifier rules, package marker = None, prompt
  hints); AST/test methods raise NotImplementedError until Phase
  3/4.
  skeleton/skeleton_models.py: ``add_init_files`` accepts optional
  ``backend`` parameter; backends whose ``package_marker_filename``
  returns None turn it into a no-op (Go/Rust/TS).
  skeleton/file_designer.py: three ``misc.py`` literals replaced
  with ``misc{self.backend.file_extension}``;
  ``validate_directory_structure`` accepts a backend so Go path
  segments are checked against Go naming rules (reject hyphens,
  keywords like ``func``).
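
A minimal sketch of the two Phase 2 behaviours, under stated assumptions: the lowercase-identifier pattern is a guess at the naming rule (the commit only says hyphens and keywords are rejected), and both helper names are hypothetical.

```python
import re
from typing import List, Optional

# The 25 reserved words of the Go language.
GO_KEYWORDS = frozenset({
    "break", "case", "chan", "const", "continue", "default", "defer", "else",
    "fallthrough", "for", "func", "go", "goto", "if", "import", "interface",
    "map", "package", "range", "return", "select", "struct", "switch", "type", "var",
})

# Assumed rule: lowercase letters, digits and underscores, starting with a letter.
_SEGMENT_RE = re.compile(r"^[a-z][a-z0-9_]*$")


def invalid_go_segments(path: str) -> List[str]:
    """Return the directory segments that break the assumed Go naming rules."""
    return [
        segment
        for segment in path.split("/")
        if segment and (not _SEGMENT_RE.match(segment) or segment in GO_KEYWORDS)
    ]


def package_marker_paths(directories: List[str], marker_filename: Optional[str]) -> List[str]:
    """Marker files to create; a backend returning None makes this a no-op."""
    if marker_filename is None:
        return []
    return [f"{directory}/{marker_filename}" for directory in directories]
```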

Phase 3 -- func_design migrates from stdlib ast to lang_parser via backend
  Gap analysis showed lang_parser's python_parser already stuffs
  the raw ast node into ``LPCodeUnit.extra['ast_node']``, so no
  schema change is needed.
  decoder_lang/backend.py: adds ``list_code_units``,
  ``format_signature``, ``list_imports`` to the Protocol.
  decoder_lang/python_backend.py: implements all three plus
  ``find_main_block_lineno`` (Python-only hook, feature-detected
  via getattr).
  decoder_lang/go_backend.py: stubs the three Protocol methods.
  func_design/interface_agent.py: ~22 ast call-sites migrated to
  ``backend.*``. Python-specific helpers (``_extract_name_from_node``,
  ``_extract_type_names``, ``ast.get_docstring`` for docstring
  inspection) stay because they are intrinsically Python-AST-shaped.
  func_design/interface_review.py: ``_insert_unit_into_file_code``
  routes through ``backend.find_main_block_lineno``; ``import ast``
  removed.
  func_design/base_class_agent.py: unused ``import ast`` removed.

Phase 4a -- code_gen helpers route through backend (scoped subset)
  code_gen/prompts.py: class+method walker uses
  backend.list_code_units; ``import ast as _ast_mod`` removed.
  code_gen/batch_prompts.py: signature summary uses
  backend.list_code_units + raw ast node (preserves param/return
  rendering byte-for-byte).
  code_gen/test_runner.py: import scanner uses
  backend.list_imports; ``import ast`` removed.
  Deliberately NOT migrated (Python-AST intrinsic, would force-fit
  wrong abstraction): static_checks.py body-shape detection,
  context_collector.py SQLAlchemy ORM heuristics, rpg_updater.py
  call-site walker.

Phase 5 -- language directive preamble for LLM prompts
  decoder_lang/prompt_directive.py: ``language_directive(backend)``
  returns an empty string for Python (zero-impact) and a 4-line
  preamble (display name, style directive, fence reminder,
  framework hint) for other languages.
  skeleton/file_designer.py: all three LLM call sites wrap the
  system prompt with
  ``with_language_directive(prompt, self.backend)``.
  Bulk substitution of literal "Python" / ".py" / "pytest" across
  the prompt body files is deferred until a real Go decoder run
  surfaces concrete failures to fix.
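
The zero-impact property of the directive can be illustrated with the sketch below. `BackendSketch` and the wording of the four preamble lines are hypothetical; only the overall shape (empty for Python, four lines otherwise) comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendSketch:
    """Hypothetical subset of a backend's prompt hints."""
    name: str
    display_name: str
    style: str
    fence: str
    framework_hint: str


def language_directive(backend: BackendSketch) -> str:
    if backend.name == "python":
        return ""  # Python prompts stay byte-identical
    return "\n".join([
        f"Target language: {backend.display_name}.",
        f"Style: {backend.style}.",
        f"Fence every code block as {backend.fence}.",
        f"Frameworks: {backend.framework_hint}.",
    ])


def with_language_directive(prompt: str, backend: BackendSketch) -> str:
    directive = language_directive(backend)
    return f"{directive}\n\n{prompt}" if directive else prompt
```

Returning the prompt untouched when the directive is empty is what keeps the existing Python pipeline output unchanged while other languages get the preamble.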

Test coverage
  101 new tests in decoder_lang/tests/ (Phases 0/1/2/3/5).
  Full decoder regression suite: 1055 passed / 30 pre-existing
  flakes (identical to dev/lang-parse-module-on-main baseline) at
  every intermediate and final state.

Planning notes (gitignored, under CoderMind/plans/)
  decoder_multilang.md            : architecture + 6-phase plan
  decoder_multilang_phase3_gap_analysis.md
  decoder_multilang_test_runbook.md
…f-files)

Route the five core json write sites in the encoder + post-commit
hook through ``common.rpg_io.atomic_write_rpg`` so a process kill
in the middle of serialisation can never leave a truncated file
behind. Bench mid-run kills (e.g. ``./bench encode-repos`` while a
worker is mid-encode) used to corrupt the cache and required
manual cleanup; with this change the original file always survives
intact and the ``.tmp`` is removed automatically on every catchable
failure (only a SIGKILL mid-write can leave one orphaned).

Failure modes the new path handles vs the old ``open("w") +
json.dump`` pattern:

* SIGKILL mid-write          : old file intact, .tmp orphaned
* Serialiser TypeError       : old file intact, .tmp cleaned up
* Disk full mid-flush        : old file intact, .tmp cleaned up
* fsync failure              : exception raised, no rename
* Successful path            : .tmp renamed atomically over target

To support the encoder rounds that previously needed ``default=``
for non-serialisable objects (NodeMetaData etc.),
``atomic_write_rpg`` now forwards ``**dump_kwargs`` to
``json.dump`` so callers migrate without losing custom serialiser
hooks.
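
A sketch of the write-to-temp, fsync, rename pattern with kwarg forwarding. This is the common idiom, not the body of ``atomic_write_rpg``; the function name `atomic_write_json` and the use of `tempfile.mkstemp` are assumptions.

```python
import json
import os
import tempfile
from typing import Any


def atomic_write_json(path: str, payload: Any, **dump_kwargs: Any) -> None:
    """Serialise to a sibling temp file, fsync it, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, **dump_kwargs)  # e.g. default=str
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # replaces the target in a single step
    except BaseException:
        # Any catchable failure removes the temp file; the old target is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A SIGKILL cannot be caught, so in that case the cleanup branch never runs and the temp file is left behind, which matches the first row of the failure-mode list above.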

Sites migrated:

  CoderMind/scripts/rpg/models.py
    save_json          : every RPGService.save call site
    save_dep_graph     : post-commit hook + initial encode
  CoderMind/scripts/rpg_encoder/run_encode.py
    initial rpg.json    write
    initial dep_graph.json write
  CoderMind/scripts/rpg_encoder/rpg_encoding.py
    two intermediate rpg.json writes (use the new dump_kwargs path
    for ``default=`` round-tripping of encoder-internal objects)
  CoderMind/scripts/update_graphs.py
    post-commit hook dep_graph.json refresh

Tests:
  CoderMind/tests/test_rpg_io.py (+2)
    test_forwards_dump_kwargs            : kwarg propagation
    test_no_partial_file_on_serialise_failure
      Verifies the cleanup contract the bench kill case kept hitting.

Regression: 1057 passed / 30 pre-existing flakes (1055 baseline +
2 new atomic-write tests).

Follow-up to ac853f7. Audit found four additional ``json.dump``
sites that serialise RPG-derived payloads but were missed in the
first sweep — same half-file risk, same one-line fix using
``common.rpg_io.atomic_write_rpg``.

Sites migrated:

  CoderMind/scripts/rpg_encoder/run_update_rpg.py
    ``cmind update-rpg`` main rpg.json write. The previous pass
    covered run_encode.py / rpg_encoding.py but the incremental
    update path used its own bare ``open("w") + json.dump`` and
    so the half-file failure mode survived for users running
    ``cmind update-rpg`` after the encoder's initial run.

  CoderMind/scripts/rpg_encoder/version_control.py
    ``RPGVersionControl.save_version``. The version history file
    embeds ``rpg.to_dict()``; a killed save left a half-written
    rpg.vN.json that ``rollback(version=N)`` then failed to parse.
    ``RPGVersionControl.rollback`` was already using
    ``atomic_write_rpg`` for the symmetric main-file write since
    ac853f7, so this commit aligns the save side with the load side.

  CoderMind/scripts/rpg_encoder/rpg_evolution.py
    ``RPGEvolution.process_diff`` save_path block. Embeds
    ``rpg.to_dict()`` as ``rpg.structure`` plus a feature_tree and
    diff summary; mid-write kill left a truncated diff artefact
    that ``cmind diff`` / debug tools failed to read on next load.

  CoderMind/scripts/code_gen/stage_io.py
    ``save_stage_result`` for ``codegen_<name>.json`` sidecars
    (final_test / smoke_test / global_review). global_review loads
    all three as context; a killed earlier stage used to leave a
    truncated sidecar that surfaced as a JSONDecodeError later in
    the pipeline (looked like a global_review bug). Uses the
    ``default=str`` forwarding extension to ``atomic_write_rpg``
    added in ac853f7 to preserve the original Path / datetime
    fallback serialiser without behaviour change.

Tests:

  No new tests added — the contract (no partial file on kill or
  serialise error) is already covered by
  ``test_no_partial_file_on_serialise_failure`` in
  CoderMind/tests/test_rpg_io.py from ac853f7. Existing test
  suites for the migrated sites
  (``test_workflow_integration::TestRPGVersionControl``,
  ``test_rpg_evolution``) continue to pass.

Regression: same 29 pre-existing flakes as before this commit
(test_initial_encode_prompt: log location mismatch with
~/.cmind workspace dir; test_step3_polish: hook install API
drift; test_rpg_evolution::test_update_dep_graph_index_no_crash:
unrelated to atomic write). 991 passed unchanged. Bench smoke
batch (5 repos, parallel 3) re-runs with all-PASS verdicts:
chalk 289s / cobra 488s / requests 550s / sds 621s.

The dep_graph has been duplicated for years: ``run_encode.py`` wrote
both ``rpg.json`` (which embeds it via ``RPG.to_dict(include_dep_graph
=True)``) AND a standalone ``dep_graph.json``. The post-commit hook
then refreshed only the standalone file, so ``RPGService.load`` had
to override the embedded copy with the external one on every read.
Two consequences:

* ~650 KB of duplicated bytes per encode on disk.
* Subtle drift: ``RPG.load_json`` (encoder side) returned the embedded
  copy while ``RPGService.load`` (decoder side) silently overrode it
  with whatever the hook had last persisted. Bench runs eventually
  surfaced this through "rpg.json says X, dep_graph.json says Y"
  diagnostics.

This commit makes ``rpg.json`` the single source of truth:

* Encoder no longer writes a standalone ``dep_graph.json``; the
  in-memory ``rpg.dep_graph`` is embedded by ``RPG.to_dict`` and
  persisted via the existing atomic ``rpg.json`` write.
* All downstream writers (``update_graphs.py`` modes, ``run_batch``
  codegen refresh, ``rpg_encoder.run_update_rpg``, ``RPGEvolution``
  process_diff path) stop passing ``save_path`` for the dep_graph;
  they rely on ``svc.save(rpg_path)`` to roll the dep_graph into
  rpg.json.
* ``RPGService.load`` flips its read order: embedded first, legacy
  external ``_dep_graph_file`` only as a backward-compat fallback
  (logs an INFO when used so silent migrations remain visible).
* All known reader sites either already preferred embedded
  (``rpg_visualize.py::load_rpg``, ``rpg_edit/validate.py``) or now
  do (``update_graphs.py::update_feature`` mode).
* ``RPG.save_dep_graph`` / ``RPG.load_dep_graph`` keep their public
  signature: callers that still want a standalone snapshot for
  debugging (and the legacy CLI ``--dep-graph`` flags) keep working.
* ``--dep-graph`` CLI flags stay so legacy workspaces with an
  existing ``dep_graph.json`` continue to load via the compat path.
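
The flipped read order might be sketched as follows. The ``dep_graph`` and ``_dep_graph_file`` keys are taken from this commit message; the function name, and the assumption that the legacy pointer is a path relative to rpg.json, are hypothetical.

```python
import json
import logging
import os
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


def load_dep_graph(rpg_path: str) -> Optional[Dict[str, Any]]:
    """Prefer the dep_graph embedded in rpg.json; fall back to the legacy file."""
    with open(rpg_path, encoding="utf-8") as handle:
        rpg = json.load(handle)

    embedded = rpg.get("dep_graph")
    if embedded:
        return embedded

    legacy_name = rpg.get("_dep_graph_file")
    if legacy_name:
        # Assumed: the pointer is resolved relative to the rpg.json directory.
        legacy_path = os.path.join(os.path.dirname(rpg_path), legacy_name)
        if os.path.exists(legacy_path):
            logger.info("dep_graph loaded from legacy file %s", legacy_path)
            with open(legacy_path, encoding="utf-8") as handle:
                return json.load(handle)
    return None
```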

Sites touched (10 files):

  CoderMind/scripts/rpg_encoder/run_encode.py
    Delete the 40-LOC standalone dep_graph.json write block.
    dep_graph parses + embeds via rpg.to_dict() as before; no
    second-file write.

  CoderMind/scripts/rpg/service.py
    - ``RPGService.load``: prefer embedded dep_graph over external
      ``_dep_graph_file``; INFO log on the legacy fall-through.
    - ``sync_from_commit_diff``, ``sync_from_file_list``,
      ``_apply_incremental_dep_graph_update``: ``save_path`` becomes
      ``Optional[...]`` (default ``None``) so callers can rely on the
      embedded copy.

  CoderMind/scripts/update_graphs.py
    All 6 modes (dep / mapping / feature / full / enrich / sync)
    drop the standalone write. ``dep`` mode prefers rpg.json
    (only falls back to writing dep_graph.json when no rpg.json
    exists yet — the very-first pre-commit hook case). All writers
    clear ``_dep_graph_file`` so RPGService.load doesn't fall through
    on the next read.

  CoderMind/scripts/run_batch.py
    ``_refresh_dep_graph_safe`` (codegen path) stops passing
    save_path. dep_graph rides inside rpg.json via the subsequent
    ``svc.save(rpg_path)``.

  CoderMind/scripts/rpg_encoder/rpg_evolution.py
    ``_update_dep_graph_index``: legacy "may be stale" WARNING
    downgraded to INFO "embeds into rpg.json on save" — the new
    default path.

  CoderMind/scripts/rpg_encoder/run_update_rpg.py
    Stops passing ``dep_graph_save_path`` to ``process_diff``.

  CoderMind/scripts/rpg_edit/validate.py
    Already preferred embedded; updated error text to mention
    ``/cmind.encode`` (which produces embedded) rather than the
    obsolete ``cmind script update_graphs.py sync`` standalone path.

  CoderMind/scripts/rpg/models.py
    Deprecation note on ``save_dep_graph`` docstring.

  CoderMind/scripts/common/paths.py
    Deprecation note on ``DEP_GRAPH_FILE`` constant.

Tests:

  CoderMind/tests/test_step4_integration.py
    - ``test_update_dep_graph_index_writes_dep_graph_json`` →
      ``test_update_dep_graph_index_populates_in_memory_dep_graph``
      (asserts in-memory + embedded round-trip via to_dict).
    - ``test_update_dep_graph_index_save_path_outside_rpg_dir`` →
      ``test_update_dep_graph_index_legacy_save_path_still_writes_standalone``
      (verifies the legacy compat path still works for callers that
      do pass save_path).
    - ``test_update_dep_graph_index_without_save_path_logs_warning`` →
      ``test_update_dep_graph_index_without_save_path_logs_info``
      (the WARNING→INFO downgrade — new default path is no longer
      a degraded mode).
    - ``test_process_diff_threads_dep_graph_save_path`` →
      ``test_process_diff_embeds_dep_graph_into_rpg``.
    - ``test_run_update_rpg_advances_meta_git_and_runs_align``:
      assertion flipped from "dep_graph.json exists on disk" to
      "rpg.json has embedded ``dep_graph`` field".

Regression: 990 → 991 passed (the 1-test delta was the renamed
``test_update_dep_graph_index_without_save_path_logs_info`` taking
the slot of the obsolete WARNING test).  29 pre-existing flakes
unchanged.

Manual verification:

* Loaded all 5 real bench rpg.json artefacts (chalk / cobra / requests
  / sds / aho-corasick) via ``RPGService.load`` — all reconstruct
  dep_graph from the embedded copy with the expected node/edge/mapping
  counts.
* Simulated new-encoder output by stripping ``_dep_graph_file`` and
  the standalone ``dep_graph.json`` from a tmpdir copy of cobra's
  rpg.json; ``RPGService.load`` still loaded 438 dep_nodes / 250
  mappings from the embedded copy. Re-saving preserved the embedded
  copy and cleared the legacy pointer field.

Replace migration-phase labels and archaeological comments with
present-tense descriptions of the current behaviour. The cleanup covers
runtime code, decoder-language tests, and workflow comments so source
files no longer refer to implementation phases such as `Phase 3` or to
past parser paths that only make sense when reading the original plan.

The remaining `previously` matches are prompt text that describes user
state across LLM interactions, not source-code history. Those prompts
need the temporal wording to preserve their semantics.

Tests:

  CoderMind/scripts/decoder_lang/tests
    Verify backend registry, language resolution, skeleton wiring,
    code-structure helpers, and prompt directives still behave as before.

  CoderMind/tests/test_rpg_io.py
  CoderMind/tests/test_step4_integration.py
  CoderMind/tests/test_workflow_integration.py
    Verify the adjacent encoder and RPG IO paths touched by the latest
    refactors still pass.

Regression: 184 passed / 60 subtests passed.
Implement Go code-unit discovery, import listing, syntax checks, environment detection, test commands, dependency install commands, and go test output parsing behind the existing LanguageBackend contract.

Update decoder_lang tests to cover the Go backend behavior while keeping Python default package-marker behavior explicit.
Promote list leaves through a small normalizer so string leaves, named dict leaves, and single-key dict leaves can become branch nodes without using unhashable dicts as keys.

Add regression coverage for dict leaves emitted by feature expansion.
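
The leaf promotion described above might look like the sketch below. The leaf shapes are assumptions made for illustration: a "named dict leaf" is taken to carry a `name` key, and a "single-key dict leaf" is taken to map one name to a list of children. `leaf_name` and `promote_leaves` are hypothetical helpers, not the project's functions.

```python
from typing import Any


def leaf_name(leaf: Any) -> str:
    """Return a hashable string name for a list leaf."""
    if isinstance(leaf, str):
        return leaf
    if isinstance(leaf, dict):
        name = leaf.get("name")
        if isinstance(name, str):
            return name  # named dict leaf, e.g. {"name": "parse", ...}
        if len(leaf) == 1:
            (key,) = leaf  # single-key dict leaf, e.g. {"parse": [...]}
            return str(key)
    raise TypeError(f"unsupported leaf shape: {leaf!r}")


def promote_leaves(leaves: list[Any]) -> dict[str, list[Any]]:
    """Turn a list of leaves into branch nodes keyed by name, never by dict."""
    branches: dict[str, list[Any]] = {}
    for leaf in leaves:
        children: list[Any] = []
        if isinstance(leaf, dict) and len(leaf) == 1 and "name" not in leaf:
            (value,) = leaf.values()
            children = list(value) if isinstance(value, list) else []
        branches.setdefault(leaf_name(leaf), []).extend(children)
    return branches
```
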
Restore compatibility for direct encode checks, mocked edge statistics, dep-graph code metadata, initial encode progress parsing, and home-side log assertions.

Update tests to match the current post-commit/post-merge dispatcher contract and canonical double-colon RPG paths.
Keep decoder_lang package documentation aligned with the implemented Go backend capabilities.
Add target_languages alongside target_language, infer missing language hints from requirement docs, and propagate the primary language through feature_build, feature_refactor, RPG construction, and build_skeleton.

This keeps existing scalar consumers compatible while allowing multi-language specs to carry an ordered language list.
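
One way to keep the scalar and list fields consistent is sketched below. `normalize_language_fields` is a hypothetical helper and the rule "the first list entry is the primary language when none is declared" is an assumption; only the `target_language` and `target_languages` names come from the commit message.

```python
from typing import Any


def normalize_language_fields(spec: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of spec with both language fields populated consistently."""
    out = dict(spec)
    languages = list(out.get("target_languages") or [])
    primary = out.get("target_language")

    # A declared primary language always appears in the ordered list.
    if primary and primary not in languages:
        languages.insert(0, primary)
    # Without a declared primary, fall back to the first listed language.
    if not primary and languages:
        primary = languages[0]

    out["target_languages"] = languages
    out["target_language"] = primary
    return out
```
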
Move decoder language metadata into the feature artifact meta block so
generated outputs have one canonical place for primary and target
languages.

Thread that metadata through feature construction, skeleton planning,
data flow, base class design, interface design, RPG creation, and task
planning. Add focused coverage for Go plan-stage behavior and tighten
feature_construct checks so missing or inconsistent language metadata is
reported early.
Register Rust and TypeScript decoder backends so planning stages use
target-language parser, prompt, and validation behavior instead of falling
back to Python.

Add language-specific project tasks for Rust and TypeScript and make
plan warnings fail the pipeline so partial interface coverage cannot be
reported as a successful plan.
Track generated interface coverage in the interface orchestrator and
return a failing status when files or skeleton features remain uncovered.
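
A coverage check of this kind can be sketched as below. The input shape (file path mapped to a list of feature names) and the `coverage_report` name are assumptions for illustration; the real orchestrator's data model is not shown in this PR text.

```python
def coverage_report(
    skeleton: dict[str, list[str]],
    interfaces: dict[str, list[str]],
) -> dict[str, object]:
    """Compare skeleton files/features against generated interfaces."""
    missing_files = sorted(set(skeleton) - set(interfaces))
    missing_features = sorted(
        (path, feature)
        for path, features in skeleton.items()
        for feature in features
        if feature not in interfaces.get(path, [])
    )
    complete = not missing_files and not missing_features
    return {
        "status": "ok" if complete else "failed",
        "missing_files": missing_files,
        "missing_features": missing_features,
    }
```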

Add backend-owned project task templates so future language support can
provide dependency, entrypoint, and README prompts from the backend
instead of requiring new plan_tasks branches.
Reuse already-complete interface subtrees when plan reruns from a warning state.

This prevents design_interfaces from regenerating successful prefixes and lets
strict plan resumes continue from the first incomplete subtree instead of
repeating long LLM calls.
Print subtree restore and coverage progress during design_interfaces so long plan runs can be tracked from stdout without digging into trajectory files.
Remove Python-only pass/docstring wording from interface prompts and skip the Python-specific global interface review for non-Python targets.

This keeps Rust and TypeScript interface design focused on target-language declaration stubs instead of Python repair behavior.
Remove the outer whole-subtree retry from interface generation.

The inner per-subtree loop already retries missing files and features; repeating the entire subtree can turn small multilingual plan fixtures into long-running loops before surfacing incomplete coverage.
Accept TypeScript declare function/class/type/interface/enum declarations during interface validation.

This prevents valid declaration-only interface snippets from being rejected as having no target-language declarations.
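
As a rough illustration of what "accepting `declare` declarations" means, the regex-based sketch below recognizes the five declaration kinds named above. It is an assumption-laden stand-in: the project validates with a parser, and `ts_declarations` is a hypothetical helper.

```python
import re

# Matches optional `export` / `declare` / `abstract` prefixes before a declaration keyword.
_TS_DECLARATION_RE = re.compile(
    r"^\s*(?:export\s+)?(?:declare\s+)?(?:abstract\s+)?"
    r"(function|class|type|interface|enum)\s+([A-Za-z_$][\w$]*)",
    re.MULTILINE,
)


def ts_declarations(code: str) -> list[tuple[str, str]]:
    """Return (kind, name) pairs for top-of-line TypeScript declarations."""
    return [(m.group(1), m.group(2)) for m in _TS_DECLARATION_RE.finditer(code)]
```
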
Use backend-safe docstring detection for Python interface validation and avoid injecting Python import conventions into non-Python interface prompts.

Reconcile global review counts after orphan retain/prune decisions so retained Python entry points no longer leave stale orphan counts that make bench report WARN.
Normalize full Markdown code fences before interface syntax validation.

This lets Go, Rust, TypeScript, and Python validators accept common LLM outputs where the code field contains a fenced source block.
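
Fence normalization of this kind can be sketched as follows. `strip_full_fence` is a hypothetical name, and the sketch only unwraps a payload that is entirely one fenced block; anything else is returned unchanged.

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests cleanly in Markdown

_FULL_FENCE_RE = re.compile(
    r"\A\s*" + FENCE + r"[^\n]*\n(.*?)\n?" + FENCE + r"\s*\Z",
    re.DOTALL,
)


def strip_full_fence(code: str) -> str:
    """Unwrap `code` when the whole string is a single fenced block."""
    match = _FULL_FENCE_RE.match(code)
    return match.group(1) if match else code
```
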
Strip TypeScript comments before parser-backed syntax and code-unit extraction so valid declaration snippets with JSDoc text are not rejected.

This keeps original interface code intact while making validation tolerant of common documentation comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 3 comments.

Comment thread CoderMind/scripts/rpg_visualize.py Outdated
Comment thread CoderMind/scripts/code_gen/final_validation.py Outdated
Comment thread CoderMind/scripts/lang_parser/config/cpp.py
The final-test and smoke-test repair agents told the sub-agent to verify
with a hardcoded pytest command, but final_test now runs the resolved
backend's suite (Go/Rust/TS/JS as well as Python). Build the verify
command from the backend so the repair agent uses the project's real
test tool instead of pytest on a non-Python repo.
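
The idea of deriving the repair agent's verify command from the backend can be sketched like this. The `TEST_COMMANDS` table and `verify_instruction` are hypothetical; the real commands are owned by each language backend and may differ.

```python
import shlex

# Hypothetical per-language test commands; the real values live in the backends.
TEST_COMMANDS: dict[str, list[str]] = {
    "python": ["python", "-m", "pytest"],
    "go": ["go", "test", "./..."],
    "rust": ["cargo", "test"],
    "typescript": ["npm", "test"],
    "javascript": ["npm", "test"],
}


def verify_instruction(language: str) -> str:
    """Build the 'verify your fix' line for a repair prompt."""
    command = TEST_COMMANDS.get(language)
    if command is None:
        raise KeyError(f"no test command registered for {language!r}")
    return f"Verify the fix by running: {shlex.join(command)}"
```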

Also fix inconsistent indentation in rpg_visualize.load_rpg.

Copilot AI left a comment

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 5 comments.

Comment thread CoderMind/scripts/smoke_test.py
Comment thread CoderMind/scripts/lang_parser/registry.py Outdated
Comment thread CoderMind/scripts/skeleton/skeleton_models.py Outdated
Comment thread CoderMind/scripts/rpg/service.py
Comment thread CoderMind/scripts/code_gen/final_validation.py
- smoke_test: locate the real entry via entry_point_candidates globs so
  Go's cmd/<name>/main.go is probed instead of silently skipped
- lang_parser: preserve Windows drive letters in path normalization
- final_validation: distinguish toolchain-unavailable from zero-collected
  test runs in the no-op guard
- skeleton_models: drop dead marker_default_body variable
- rpg/service: correct sync_from_file_list docstring (any language, not .py)
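
For the `lang_parser` item above, drive-letter-preserving normalization could look like the sketch below. `normalize_path` is a hypothetical helper written for illustration; the actual normalization rules in `lang_parser` are not shown in this PR text.

```python
import re

_DRIVE_RE = re.compile(r"^[A-Za-z]:")


def normalize_path(path: str) -> str:
    """Use forward slashes and drop '.' and empty segments, keeping any drive letter."""
    unified = path.replace("\\", "/")
    drive = ""
    match = _DRIVE_RE.match(unified)
    if match:
        drive = match.group(0)
        unified = unified[len(drive):]
    is_absolute = unified.startswith("/")
    parts = [part for part in unified.split("/") if part not in ("", ".")]
    return drive + ("/" if is_absolute else "") + "/".join(parts)
```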

Copilot AI left a comment

Pull request overview

Copilot reviewed 127 out of 127 changed files in this pull request and generated 3 comments.

Comment thread CoderMind/scripts/feature/spec.py
Comment thread CoderMind/scripts/rpg_encoder/prompts/parse_prompts.py
Comment thread CoderMind/pyproject.toml Outdated
HuYaSen and others added 12 commits June 15, 2026 15:28
- feature/spec: drop bare-word `\bgo\b` from language inference so English
  prose ("go to ...") no longer misdetects Go and picks the wrong backend;
  keep only unambiguous signals (golang/go.mod/go test|run|build/go
  language/go project)
- parse_prompts: use a real constructor name (__init__) in the PARSE_CLASS
  example instead of the synthetic `new_loader`
- pyproject: the tree-sitter grammars are mandatory deps; reword the comment
  so it no longer calls them optional
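
The `feature/spec` change can be illustrated with a small sketch that matches only the unambiguous Go signals listed above. The exact pattern and the `mentions_go` name are assumptions, not the project's code.

```python
import re

# Only unambiguous signals; a bare "go" is deliberately not matched.
_GO_SIGNALS = re.compile(
    r"\bgolang\b"
    r"|\bgo\.mod\b"
    r"|\bgo\s+(?:test|run|build)\b"
    r"|\bgo\s+(?:language|project)\b",
    re.IGNORECASE,
)


def mentions_go(text: str) -> bool:
    """Return True when the text contains an unambiguous reference to Go."""
    return bool(_GO_SIGNALS.search(text))
```
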
Commit b8b9bef changed the example method key to "__init__" on review
feedback, but that reintroduces a Python-specific constructor name into
an encoder prompt that must stay language-neutral (it parses Go/Rust/C/
C++/TS/JS too). "__init__" is on the forbidden-term list asserted by
test_multilingual_prompt_safety. Use the real, language-neutral method
name "configure" instead, which still satisfies the "use a real method
name" review point without biasing the model toward Python.
Normalize dep-graph code-unit IDs through the canonical RPG path converter so
non-Python nodes like `src/store.go:Load` can match `src/store.go::Load`.
Run exact/canonical and suffix code-unit matching before falling back to the
parent file node, preventing function/class/method dep nodes from being
coarsely mapped to files when dedicated RPG nodes exist.

Add regression coverage for Python, Go, and TypeScript code-unit mappings.
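
The matching order described above (exact or canonical match, then suffix match, then the parent file) can be sketched as below. Both helpers are hypothetical, and the suffix rule (a node in the same file whose symbol ends with the dep-graph symbol) is an assumption about what "suffix matching" means here.

```python
from typing import Optional


def canonical_unit_id(unit_id: str) -> str:
    """Convert `path:Symbol` to the double-colon form `path::Symbol`."""
    if "::" in unit_id:
        return unit_id
    path, sep, symbol = unit_id.rpartition(":")
    # Leave plain paths (and Windows drive prefixes) untouched.
    if not sep or not path or "/" in symbol or "\\" in symbol:
        return unit_id
    return f"{path}::{symbol}"


def match_rpg_node(dep_id: str, rpg_nodes: set[str]) -> Optional[str]:
    """Pick the most specific RPG node for a dep-graph code unit."""
    canonical = canonical_unit_id(dep_id)
    if canonical in rpg_nodes:
        return canonical
    file_path, _, symbol = canonical.partition("::")
    if symbol:
        prefix = file_path + "::"
        candidates = sorted(
            node
            for node in rpg_nodes
            if node.startswith(prefix)
            and (node.endswith("." + symbol) or node.endswith("::" + symbol))
        )
        if candidates:
            return candidates[0]
    # Last resort: map the unit to its parent file node.
    return file_path if file_path in rpg_nodes else None
```
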
Update the initial encode progress parser to match the current encoder log message, `Total valid source files to parse`, instead of the old Python-only wording. Adjust progress parser tests and the mocked initial encode stderr to use the current source-file message.
Update the C++ backend's CMake test command to run `ctest` with
`--test-dir <repo>/build`, matching the existing out-of-source
`cmake -S <repo> -B <repo>/build` reconfigure step. Also update the C++ fallback test command shown in codegen prompts so agents use the same CTest invocation.

Add regression coverage for CMake-based C++ test command construction.
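
The command pair described above can be sketched as follows. `cmake_test_commands` is a hypothetical helper; the `-S`/`-B` and `--test-dir` options are standard CMake/CTest flags (`--test-dir` needs CMake 3.20 or newer).

```python
from pathlib import Path


def cmake_test_commands(repo: Path) -> list[list[str]]:
    """Return the reconfigure step and the CTest run for an out-of-source build."""
    build_dir = repo / "build"
    return [
        ["cmake", "-S", str(repo), "-B", str(build_dir)],
        ["ctest", "--test-dir", str(build_dir)],
    ]
```
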
Extend related test discovery beyond Python using conservative conventions for Go, Rust, JavaScript/TypeScript, C, and C++ test files. Keep Python's existing scoped pytest behavior, but continue running project-level verification for non-Python backends so file paths are not misused as backend-specific test selectors.

Update post-verification logging to report related test hints separately from the actual project test command, and add regression coverage for multi-language discovery plus non-Python full-suite safety.
Use lang_parser's language-specific test-file rules for supported source files
during dependency-graph build filtering, while preserving the legacy test-ish
filter for non-source files. This prevents valid non-Python sources such as Go
testdata fixtures or TypeScript test helper modules from being excluded before
parse-time filtering can see them.

Add regression coverage for Go testdata, non-Python helper source files,
language-specific test files, and legacy non-source test-ish exclusions.
Allow codegen backend resolution to use on-disk source scanning as a fallback
when feature_spec and RPG metadata do not declare a target language. Pass the
repo path through prompt-building call sites so API summaries and TDD prompts
resolve the same backend, while preserving explicit feature_spec/RPG language
metadata precedence.

Add regression coverage for metadata-free Go repos and explicit metadata
overriding source-scan fallback.
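
The precedence described above can be sketched like this. The extension table, the helper names, and the choice to check feature_spec before RPG metadata are all assumptions for illustration; the source only states that explicit metadata wins over the on-disk scan.

```python
from collections import Counter
from pathlib import Path
from typing import Optional

# Hypothetical extension table; the real one belongs to the language backends.
EXTENSION_LANGUAGE = {
    ".py": "python", ".go": "go", ".rs": "rust",
    ".ts": "typescript", ".js": "javascript", ".c": "c", ".cpp": "cpp",
}


def scan_language(repo: Path) -> Optional[str]:
    """Return the most common source language found on disk, if any."""
    counts = Counter(
        EXTENSION_LANGUAGE[path.suffix]
        for path in repo.rglob("*")
        if path.is_file() and path.suffix in EXTENSION_LANGUAGE
    )
    return counts.most_common(1)[0][0] if counts else None


def resolve_language(
    feature_spec_language: Optional[str],
    rpg_language: Optional[str],
    repo: Optional[Path] = None,
) -> Optional[str]:
    """Explicit metadata first; scan the repo only when none is declared."""
    if feature_spec_language:
        return feature_spec_language
    if rpg_language:
        return rpg_language
    return scan_language(repo) if repo is not None else None
```
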
@QingtaoLi1 QingtaoLi1 self-requested a review June 25, 2026 01:49
Add backend-owned file dependency edge resolution for supported languages and
refactor plan_tasks.py to sort files per inferred backend instead of using a
global primary-language import heuristic. Preserve cross-language ordering
conservatively for mixed-language subtrees.

Move non-Python project task guidance fully into backends, update Go entry-point
requirements, and clarify that warning states are cross-stage contract
violations that trigger rebuilds.
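
Ordering files by dependency edges amounts to a topological sort. The sketch below uses the standard-library `graphlib`; `order_files` and the `(importer, imported)` edge shape are assumptions, and the per-backend edge resolution itself is not shown.

```python
from graphlib import TopologicalSorter


def order_files(files: list[str], edges: list[tuple[str, str]]) -> list[str]:
    """Order files so that each imported file precedes its importers.

    Raises graphlib.CycleError when the edges contain an import cycle.
    """
    known = set(files)
    graph: dict[str, set[str]] = {name: set() for name in files}
    for importer, imported in edges:
        # Ignore self-imports and edges to files outside this subtree.
        if importer in known and imported in known and importer != imported:
            graph[importer].add(imported)
    return list(TopologicalSorter(graph).static_order())
```
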
- Exclude skip directories themselves from dep graph scans
- Defer non-Python semantic edges during add_file
- Preserve Go single-line function body invokes
- Resolve TypeScript namespace import member calls
- Treat feature-free subtrees as complete when saved file blocks exist
- Improve non-Python signature summaries with name-only and declaration fallback
- Propagate target language through FuncDesigner interface phase
- Add targeted tests for resume, summaries, and language propagation
@QingtaoLi1 QingtaoLi1 merged commit 2265cf2 into main Jun 25, 2026
2 checks passed
HuYaSen pushed a commit that referenced this pull request Jun 26, 2026
…overage (#82)

## Summary

Follow-up to #67. This PR hardens the multilingual decoder/codegen
pipeline by tightening interface completeness checks, generated-artifact
hygiene, C/C++ verification semantics, and planner prompt safety. It
also restores the extracted multilingual regression tests that were
moved out of the original #67 branch for a follow-up PR.

## What changed

### Decoder and verification hardening

- Build C/C++ CMake targets before running `ctest`, so verification does
not falsely pass or fail because test executables were never built.
- Treat C/C++ `make test` targets that only compile objects, without
actually running tests, as verification errors instead of successful
test runs.
- Skip generated/build/cache directories when collecting C/C++ source
files for syntax and verification commands.
- Improve C/C++ prompt rules to:
  - avoid editing build/cache/generated artifacts,
  - use full CMake build + CTest commands,
  - avoid relying on undeclared/transitively included helper functions,
  - report explicit syntax-check summaries.

### Generated artifact hygiene

- Add a shared generated-artifact classifier and prompt rule helper.
- Install local `.git/info/exclude` hygiene rules during batch startup.
- Reject persisted generated artifacts before post-verify and after
verification runs.
- Prevent batch branches containing generated artifacts from being
merged.

### Interface and planner robustness

- Deduplicate repeated whole-file `file_code` blocks in
`interfaces.json` before serialization and before planner prompt
construction.
- Add interface coverage validation so `plan_tasks` fails fast when
`interfaces.json` does not cover all skeleton features.
- Improve Python dependency collection for same-file calls and
`self.method()` invocations.
- Save in-progress interface generation to `interfaces.json.partial` and
only overwrite the canonical `interfaces.json` after successful
completion.
- Allow interface review additions to scaffold missing file entries
under existing feature subtrees.
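
The `interfaces.json.partial` behaviour can be sketched as "write progress to a side file, then swap it in on success". The helper names are hypothetical; `os.replace` is the standard-library call that overwrites the destination in a single step.

```python
import json
import os
from pathlib import Path
from typing import Any


def partial_path(final_path: Path) -> Path:
    """Return the side file used while generation is still in progress."""
    return final_path.with_name(final_path.name + ".partial")


def save_progress(final_path: Path, interfaces: dict[str, Any]) -> Path:
    """Persist in-progress output without touching the canonical file."""
    target = partial_path(final_path)
    target.write_text(json.dumps(interfaces, indent=2), encoding="utf-8")
    return target


def promote(final_path: Path) -> None:
    """Replace the canonical file with the partial one after a successful run."""
    os.replace(partial_path(final_path), final_path)
```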

### Parser and language detection fixes

- Classify header-heavy mixed C/C++ repositories as C++ when C and C++
votes appear together.
- Harden fallback string-literal stripping against catastrophic regex
backtracking on unterminated escaped strings.
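
One way to avoid regex backtracking entirely is a single-pass scanner, sketched below. This is a swapped-in technique shown for illustration, not necessarily how the project hardened its regex; `strip_string_literals` is a hypothetical helper that blanks out the contents of quoted strings in linear time.

```python
def strip_string_literals(source: str, quotes: str = "\"'") -> str:
    """Empty the contents of quoted strings, tolerating unterminated ones."""
    out: list[str] = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch not in quotes:
            out.append(ch)
            i += 1
            continue
        quote = ch
        out.append(quote)
        i += 1
        while i < n and source[i] != quote:
            # A backslash consumes the following character as well.
            step = 2 if source[i] == "\\" else 1
            # Keep newlines so line numbers stay stable.
            out.extend(c for c in source[i:i + step] if c == "\n")
            i += step
        if i < n:  # closing quote found
            out.append(quote)
            i += 1
    return "".join(out)
```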

### Final validation behavior

- Propagate smoke-test failures into the final validation result instead
of allowing a successful unit-test result to mask a failed smoke check.
- Clarify that `plan --check-only` warning states are not complete/done
states and should not allow downstream stages to proceed.

## Test coverage

This PR adds extensive regression coverage for the multilingual
pipeline, including:

- generated artifact hygiene,
- interface source deduplication,
- skeleton/interface coverage validation,
- multilingual dependency graph behavior,
- multilingual encoder/codegen behavior,
- planner language support and prompt deduplication,
- C/C++/Go/Rust/TypeScript/JavaScript/Python parser behavior,
- decoder language backends and planning phases,
- zero-test guard behavior,
- final test repair,
- repo language resolution,
- orphan/test/build exclusion handling,
- smoke multilingual coverage.

The diff adds 33 new test files and restores the extracted multilingual
tests intended for this follow-up PR.

## Notes for reviewers

- This branch was rebased on top of the squash-merged #67 commit, so the
PR diff should now represent only the follow-up hardening and restored
tests.
- `run_batch` / `post_verify` now update `.git/info/exclude` with local
generated-artifact exclusions. This is intentionally local-only and
non-destructive.
- `plan_tasks` now fails on incomplete interface coverage instead of
silently planning from stale or partial interfaces.
- C/C++ projects whose `make test` target only compiles objects but does
not execute tests will now be rejected as invalid verification results.

3 participants