fix: preserve error context across stage / runner / validation paths#15
fix: preserve error context across stage / runner / validation paths#15GrigoryEvko wants to merge 802 commits into
Conversation
short id will generate based on full id when required
…regression feature weights
… and improve docstring clarity
refactor(memory): deduplicate code and delete dead paths
Replace hand-written RetrievalWeights.from_mapping() and CardUpdateDedupConfig.from_mapping() dict parsers (~87 lines) with Pydantic v2 @model_validator(mode="before") — same behavior, idiomatic. Also: add docstrings to all functions in card_update_dedup.py, fix stale test references to deleted _apply_update_actions wrapper, add test for flat config format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor(memory): Pydantic-idiomatic config parsing
- memory_write_example.py → write_pipeline.py (it's production, not an example) - memory_write_config.py → write_pipeline_config.py - selected_ideas_6.py → origin_analysis.py (remove versioned filename) - Delete test_memory_write_example_extended.py (duplicate of test_write_pipeline.py) - Update all 8 import sites + 1 dynamic importlib.import_module call Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor(memory): rename write pipeline and analysis files
Extract from card_conversion.py (554 → 420 lines): - base.py: GigaEvoMemoryBase abstract class (20 lines) - card_search.py: format_search_results, search_cards_by_keyword, synthesize_search_results (115 lines) Update 4 import sites directly (no re-exports). card_conversion.py retains: normalization, conversion, GAM config, constants, protocols. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor(memory): split card_conversion into focused modules
Define MemoryError, MemoryRetrieverError, MemorySearchError, and MemoryStorageError in gigaevo/exceptions.py following the existing GigaEvoError hierarchy. Wire them into the memory subsystem: - gam_search.build() wraps all failures in MemoryRetrieverError - memory.py narrows two gam.build() catches from bare Exception - card_store._load() narrows to (json.JSONDecodeError, OSError) - card_dedup import block narrows to (ImportError, OSError) Resilience-critical catches (search fallback, merge loop, __exit__) remain broad by design. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor(memory): custom exception hierarchy and narrowed catches
…t base to ABC - concept_api.py: all 5 RuntimeError raises → MemoryStorageError (matches gigaevo/database pattern of wrapping I/O errors) - base.py: GigaEvoMemoryBase now uses ABC + @AbstractMethod (matches MutationOperator, Stage, LangGraphAgent pattern) - card_dedup.py: narrow two broad catches: - JSONL read fallback: except Exception → (json.JSONDecodeError, OSError) - GAM store build: except Exception → (MemoryRetrieverError, OSError) - Update 6 test assertions from RuntimeError to MemoryStorageError Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ormity refactor(memory): exception conformity + ABC base class
When write_pipeline.py passes MemoryCard/ProgramCard Pydantic models to memory_platform.save_card(), the dict() call on a Pydantic model doesn't properly flatten nested Pydantic objects like ConnectedIdea. This caused TypeError in _persist_index() when json.dumps() tried to serialize. Root cause: write_pipeline returns list[AnyCard] (Pydantic models) and both backends (memory_platform and memory/shared_memory) consume these cards via save_card(). memory_platform's normalize_memory_card() must explicitly call .model_dump() on Pydantic inputs to flatten nested objects. Fix verified: all 788 memory + integration tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests the exact bug path: Pydantic MemoryCard/ProgramCard with nested ConnectedIdea and MemoryCardExplanation objects must be properly flattened to plain dicts before JSON serialization. 6 tests covering: ProgramCard with ConnectedIdea, MemoryCard with MemoryCardExplanation, plain dict passthrough, JSON round-trips, None. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add gigaevo-memory Git dependency to pyproject.toml - Remove sys.path manipulation from memory_platform/memory.py and remote_gam_retriever.py (no longer needed with proper install) - Simplify test file to use direct imports instead of module mocking Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Expands from 6 to 11 tests covering the complete save_card → _persist_index flow with Pydantic inputs. Tests verify: - normalize_memory_card: ConnectedIdea/MemoryCardExplanation → dict - save_card: Pydantic ProgramCard/MemoryCard → JSON-serializable index - _card_to_backend_content: API payload is clean dict - persist/reload roundtrip: index file survives write→read cycle Uses _make_platform_memory() factory with mocked API client to test memory_platform in isolation without network dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add docstrings to 15 public methods across 5 files (memory.py, concept_api.py, card_dedup.py, openai_inference.py, write_pipeline.py) - Add return type annotations to 4 functions in amem_gam_retriever.py - Fix 2 mypy errors: annotate retrievers dict, rename variable in api_sync.py - Extract magic numbers: _MAX_SUMMARY_CHARS, _MAX_DESCRIPTION_CHARS, _ENTITY_NAME_MAX_LENGTH, _MAX_CONNECTED_DESCRIPTIONS Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
refactor(memory): type annotations, docstrings, constants, platform bug fix
Four small fixes for the same anti-pattern: error sites that discard
useful debugging information before logging or re-raising.
ER2 / Optuna and shared optimization utils: ``ExecRunnerError`` rendering
used ``.stderr.rsplit("\n", 1)[-1]`` to grab the last line of the worker's
captured stderr. That dropped the entire traceback above the final
``RuntimeError: ...`` line, which is exactly the part a human needs.
Both ``optimization/utils.py:evaluate_single`` and the optuna
``stage.py`` per-trial path now keep the full stderr in the returned
detail string.
ER4 / Optuna ``_apply_modifications``: ``raise ValueError(f"... {e}")``
without ``from e`` orphaned the original SyntaxError, so logs showed only
the wrapped message. Added ``from e``.
ER7 / ``ValidateCodeStage``: re-raised SyntaxErrors as
``SyntaxError(f"... {e.msg}. Line: `{code}`")`` — a plain message string
with no ``.text``/``.lineno``/``.offset``/``.filename`` set.
``traceback.format_exception`` then couldn't render the caret pointing
at the offending column. Reconstruct via
``SyntaxError(msg, (filename, lineno, offset, text))`` to keep the
structural attributes intact; the original error is still chained.
Test ``test_syntax_error`` updated to assert structural preservation
rather than the discarded prefix string.
ER8 / ``DagScheduler``: two ``logger.error("...: {}", e)`` sites in the
timeout/finished harvest paths logged only the exception's ``str()`` —
no traceback, no chain. Replaced with ``logger.exception(...)`` so
loguru's exception sink renders the full stack.
All 1432 stages + dag tests pass.
Hunter-resolver pass on the original four fixes surfaced two more
instances of the same anti-patterns inside the files this PR already
touches.
`wrapper.py` cloudpickle-loads-failure raise: `raise ExecRunnerError(...)`
inside `except Exception as e` was missing `from e`, orphaning the
underlying UnpicklingError / ImportError / AttributeError that triggered
the failure. Added `from e`.
`dag_runner.py` sibling `logger.error('...: {}', e)` sites — the
original PR fixed 2 of 11 instances in the scheduler loop; the other
9 followed the same anti-pattern in the same control flow (fetch-by-status
failure, orphan-fetch failure, mget-for-launch failure, DAG-build failure
incl. an open-coded `logger.error('Traceback:\n{}', traceback.format_exc())`
two-liner, state-update failure during build-error discard, batch
mark-started failure, DAG run failure, post-run state-transition failure,
and batch RUNNING→DONE flush failure). All converted to
`logger.exception(...)`; redundant `import traceback` + format_exc
line removed.
Test strengthening for `test_syntax_error`: also asserts
`exc.__cause__ is SyntaxError` and that `traceback.format_exception`
output contains both the offending source line and a caret `^` —
which is the whole point of preserving SyntaxError's structural
attributes.
1432 stages + dag tests pass; ruff clean.
The 1-line modifier added in 9882855 is in the legacy ``WorkerPool`` path of ``wrapper.py``. PR FusionBrainLab#14 (loky-executor) rewrites the entire module — the ``run_exec_runner`` function this raise statement lives in is replaced by ``LokyBackend.execute``, and the cloudpickle-loads site moves to ``_load_spill``. Keeping the modifier here forces a merge conflict with FusionBrainLab#14 regardless of order; removing it lets either PR merge independently against ``main``. The equivalent ``from e`` is worth re-adding inside the post-FusionBrainLab#14 ``LokyBackend`` cloudpickle path as a follow-up after either PR merges.
|
Conflict map vs PR #10 ( After the pre-emptive revert on this branch (commit vs PR #14: none. vs PR #10:
Resolution (whichever PR merges second): For the 11 For the other 3 files: take the union — keep this PR's structural fixes (full stderr instead of new_err = SyntaxError(
sanitize_for_log(e.msg or "syntax error"),
(e.filename, e.lineno, e.offset, e.text),
)
raise new_err from eThe conflict is intrinsic — same lines edited for orthogonal reasons (error-context preservation vs control-byte sanitisation). It is not eliminable at the branch level without gutting most of this PR's scope. Merge order is interchangeable; the second-merged PR needs ~15 line-level unions as documented above. |
… designs flush.py: - Add kill_run_writers() — kills parent run.py processes writing to target DBs before flushing. Previously only exec_runner children were killed, leaving the parent alive to immediately repopulate a flushed DB. - Brief 2s sleep after kills to let processes release Redis connections. preflight_check.py (check #9): - When a DB is non-empty, scan for live run.py writers and report their PIDs in the failure message so the fix is obvious. preflight_check.py (check #15): - Add factorial_design: true support in experiment.yaml. When declared, the >2-cells check is skipped (passes with a note). Without the flag, the failure message now suggests adding it. - Add helper _find_run_pids_for_db() used by check #9. experiment.yaml (heilbron/adversarial-dynamic-updates): - Add factorial_design: true to silence the false-positive MAJOR. tests: 20 tests (12 existing + 5 check15 + 3 check9), all pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Four small fixes for the same anti-pattern: error sites that discard useful debugging information before logging or re-raising. All independent, no cross-file coupling.
optimization/utils.py:evaluate_singleand the Optuna per-trial path atoptimization/optuna/stage.pyboth renderedExecRunnerErrorfailures via.stderr.rsplit("\n", 1)[-1]to grab the last line of the worker's captured stderr. That drops the entire traceback above the finalRuntimeError: ...summary line — exactly the part a human reading the log needs. Both sites now keep the full stderr in the returned detail string.optimization/optuna/stage.py:_apply_modificationsre-raised the innerSyntaxErrorasraise ValueError(f"Parameterized code syntax error: {e}")withoutfrom e, orphaning__cause__. Logs showed only the wrapped message. Addingfrom erestores the chain.ValidateCodeStagere-raised the compile-timeSyntaxErroras a plain message string —SyntaxError(f"SyntaxError at line {e.lineno}, offset {e.offset}: {e.msg}. Line: ...")— with no.text/.lineno/.offset/.filenameset on the new instance.traceback.format_exceptionthen can't render the caret pointing at the offending column. Reconstructing viaSyntaxError(msg, (filename, lineno, offset, text))keeps the structural attributes intact; the original error is still chained. Existingtest_syntax_errorupdated to assert structural preservation (exc.lineno,exc.text) rather than the discarded prefix string.DagScheduler's timeout and finished-task harvest paths usedlogger.error("...: {}", e)— only the exception'sstr()was emitted, no traceback and no__cause__chain. Replaced withlogger.exception(...)so loguru's exception sink renders the full stack.