Skip to content

fix(core): route conductor/tasks runtime chatter through logger (#649)#687

Merged
frankbria merged 3 commits into
mainfrom
fix/649-conductor-print-to-logger
Jun 16, 2026
Merged

fix(core): route conductor/tasks runtime chatter through logger (#649)#687
frankbria merged 3 commits into
mainfrom
fix/649-conductor-print-to-logger

Conversation

@frankbria

@frankbria frankbria commented Jun 16, 2026

Copy link
Copy Markdown
Owner

Summary

Fixes #649. core/conductor.py had 84 bare print() calls (Supervisor / GlobalFix / dependency-analysis chatter) and core/tasks.py had 2 on the LLM-fallback path. Because conductor.py is imported by the FastAPI server (ui/routers/batches_v2.py, tasks_v2.py), these leaked directly to server stdout — violating CLAUDE.md Architecture Rules #1 (core must be headless) and #3 (no UI/progress concepts in core).

Changes

  • conductor.py: all 84 print()logger.{info,warning,error,debug}, mapped by intent:
    • failures / exceptions (✗ Failed, Error executing task, validation failed) → error / warning
    • degraded / fallback (PARTIAL, falling back to serial, timeouts, still blocked/failed) → warning
    • progress / summaries / dependency analysis → info
    • low-value internals (cached decision, previous-status, config reload) → debug
    • Decorative leading \n/spaces stripped; ===/--- separator-only prints dropped.
  • tasks.py: LLM-fallback print() (lines 812, 818) → logger.warning.
  • Tests:
    • New tests/core/test_conductor_logging.py: an AST-based guard asserting zero print() call nodes in conductor.py / tasks.py / runtime.py, plus a caplog behavioral check on the supervisor failure path.
    • Updated 3 existing conductor tests that asserted on captured stdout to assert on caplog instead — same assertion intent, new channel (the exact behavior this issue introduces).

Verified deviations from the issue's literal line list

  • runtime.py:1020,1091 — these print() calls are inside docstring Example: blocks, not executable code. They never run or leak to stdout (AST confirms 0 real call nodes). Converting them would be incorrect, so they're left as-is.
  • core/events.py — uses Rich console.print for CLI-intended output; left as-is per the issue.

No CLI UX regression

cf work batch run keeps its own Rich summary, and events.py still prints batch/task milestones (emit_for_workspace(..., print_event=True) calls are untouched). Only raw per-task chatter moves to logger, where it's properly captured in server logs.

Acceptance criteria

  • No bare print() in conductor.py (and the named tasks.py lines) on server-reachable paths.
  • Runtime chatter goes through logger.

Testing

  • uv run pytest tests/core tests/integration tests/ui2641 passed, 2 skipped.
  • uv run ruff check → clean.

Known limitations

  • Fine-grained per-task chatter now requires logging to be configured to appear on a console (the server captures it via logger; the CLI shows milestones via events.py + its own summary). This is the intended headless-core behavior.

Summary by CodeRabbit

  • Refactor
    • Replaced direct console output with structured logging across batch execution, supervisor auto-resolution, dependency inference, resume/retry flows, and task-generation diagnostics.
    • Improved clarity of progress, state transitions, and failure handling through consistent warning/error/info messages.
  • Tests
    • Updated conductor tests to validate behavior via captured logs instead of stdout.
    • Added coverage to ensure no bare print() calls exist in server-reachable core modules and added a regression test for supervisor classification warning logging.

conductor.py had 84 bare print() calls (Supervisor/GlobalFix/dependency-analysis
chatter) and tasks.py had 2 on the LLM-fallback path. Because conductor.py is
imported by the FastAPI server, these leaked to server stdout — violating
CLAUDE.md headless-core rules (#1, #3).

- conductor.py: all 84 print() -> logger.{info,warning,error,debug} by intent;
  decorative leading newlines/spaces stripped; ===/--- separator-only prints
  dropped.
- tasks.py: LLM-fallback prints -> logger.warning.
- Add tests/core/test_conductor_logging.py: AST guard asserting zero print()
  call nodes in conductor.py/tasks.py/runtime.py, plus a caplog behavioral check.
- Update 3 conductor tests that asserted on captured stdout to assert on caplog
  (same intent, new channel — the very behavior this issue introduces).

Left as-is (verified): runtime.py:1020/1091 are print() inside docstring
Example blocks (never executed); events.py uses Rich console.print for
CLI-intended output. CLI UX is preserved via events.py milestone output
(print_event=True) and the CLI's own Rich batch summary.

Closes #649.
@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b3a04ac2-04e2-46be-857d-3b42c5441fc7

📥 Commits

Reviewing files that changed from the base of the PR and between 87c62ea and 8330c96.

📒 Files selected for processing (1)
  • codeframe/core/conductor.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • codeframe/core/conductor.py

Walkthrough

All bare print() calls in codeframe/core/conductor.py (covering supervisor resolution, global fix coordination, dependency analysis, batch resume/retry, serial/parallel execution, gate validation, and subprocess errors) and two in codeframe/core/tasks.py are replaced with structured logger calls at appropriate levels. Three existing conductor tests are updated to assert on caplog instead of capsys, and a new test_conductor_logging.py module enforces no bare print() remains and regression-tests the supervisor classification failure logging path.

Changes

print() → logger Migration in Conductor and Tasks

Layer / File(s) Summary
SupervisorResolver and GlobalFixCoordinator logging
codeframe/core/conductor.py
Replaces print() with logger.debug/info/warning in try_resolve_blocked_task, _classify_with_supervision, and all GlobalFixCoordinator state-transition and timeout paths (lines 154–426).
Dependency analysis and tasks.py fallback logging
codeframe/core/conductor.py, codeframe/core/tasks.py
Replaces print() with logger output for strategy="auto" dependency inference and its serial fallback on error, plus the JSONDecodeError/generic exception paths in generate_from_prd.
Batch resume and retry flow logging
codeframe/core/conductor.py
Replaces print() with logger.info/warning across all batch resume and retry paths: force-mode progress, no-tasks early return, per-task retry progress, resume outcomes, gate validation failure, and retry summaries.
Serial execution path logging
codeframe/core/conductor.py
Replaces print() with logger.info/warning for serial task-start, supervisor retry, completion, blocked, failure, --on-failure=stop transition, gate validation failure, config reload detection, and batch summary.
Parallel execution path logging
codeframe/core/conductor.py
Replaces print() with logger.info/warning for parallel execution plan display, group loop, worker count, per-task outcomes (_execute_single_task), parallel group completion handling, and subprocess error logging.
Test infrastructure updates
tests/core/test_conductor.py
Adds logging import and pytestmark = pytest.mark.v2 to support structured log assertions.
Existing tests: capsys → caplog
tests/core/test_conductor.py
Updates test_start_batch_strategy_parallel_works, test_task_fails_stop, and test_resume_no_failed_tasks to use caplog with the codeframe.core.conductor logger level set, asserting messages via caplog.text.
New logging enforcement and regression tests
tests/core/test_conductor_logging.py
Adds an AST-based _print_call_lines helper, a parametrized test_no_bare_print_on_server_reachable_core test, and a test_supervisor_classification_failure_logs_via_logger regression test.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Poem

🐇 Hop, hop, the prints must go,
Into the logger's tidy flow.
WARNING, INFO, DEBUG bright,
No bare print() left in sight!
The AST checks, the tests all pass —
Structured logs from first to last. 🪵

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(core): route conductor/tasks runtime chatter through logger (#649)' is concise, specific, and clearly describes the main change: replacing print() calls with logger calls in core modules to fix the architectural violation of direct stdout leakage.
Docstring Coverage ✅ Passed Docstring coverage is 90.63% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/649-conductor-print-to-logger

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown

Code Review — fix(core): route conductor/tasks runtime chatter through logger (#649)

Summary: Clean, well-scoped architectural fix. Replacing 84 bare print() calls in conductor.py and 2 in tasks.py with structured logging correctly enforces CLAUDE.md Architecture Rules #1 and #3 (headless core, no UI/progress concepts in core). The AST-based regression guard is a strong addition.


What's done well

  • Log level mapping is accurate. failures/exceptions → error/warning, degraded/fallback → warning, progress summaries → info, low-value internals → debug. No signal lost, noise correctly demoted.
  • AST guard covers runtime.py proactively. Even though runtime.py wasn't changed (its print() calls are in docstring Example: blocks — not executable), including it in the guard prevents future regressions.
  • Tests migrated consistently. Three existing tests properly move from capsyscaplog with scoped log levels (caplog.set_level(logging.INFO/WARNING, logger="codeframe.core.conductor")), preserving original assertion intent.
  • Decorator separators dropped cleanly. Removing \n===\n banners and leading-whitespace padding is correct — structured log lines don't need visual separators.

Minor concerns

1. f-string formatting anti-pattern (minor perf)

# Current (string always formatted, even when log level is disabled)
logger.info(f"Execution plan: {plan.num_groups} groups, {plan.total_tasks} tasks")

# Preferred (lazy evaluation via % args)
logger.info("Execution plan: %d groups, %d tasks", plan.num_groups, plan.total_tasks)

Since conductor.py is on hot paths (batch execution), lazy formatting avoids string allocation when the log level is not active. Not a blocker, but worth a follow-up if the project adopts a logging style convention.

2. Emoji in log messages

logger.warning("⊘ Still blocked")
logger.warning("⚠️  Batch marked PARTIAL due to failed batch-level gates")

Emoji in log output can confuse log aggregators, break grep patterns, and cause encoding issues in some environments. Since the original print() calls had them for terminal UX, and they're now going to structured logs where that UX context is lost, plain ASCII equivalents would be cleaner (e.g. "[BLOCKED] Still blocked", "[PARTIAL] Batch marked PARTIAL..."). Not a blocker — just a flag for the logging style guide.

3. CycleDetectedError logged as bare "Error: {e}"

# conductor.py (CycleDetectedError catch block)
logger.warning(f"Error: {e}")
logger.warning("Falling back to serial execution")

These two could be combined and made more specific:

logger.warning("CycleDetectedError during dependency analysis, falling back to serial: %s", e)

Saves a log entry and makes the error type discoverable without reading the next line.

4. tasks.py logger declaration not visible in diff

The diff shows logger.warning() used in tasks.py at lines 812/818, but doesn't show where logger is declared. If this was a pre-existing module-level logger = logging.getLogger(__name__), the 2641 passing tests confirm it's fine — just noting it for completeness.


Acceptance criteria check

  • No bare print() in conductor.py on server-reachable paths (AST guard confirms)
  • No bare print() in tasks.py LLM-fallback path
  • Runtime chatter goes through logger
  • events.py Rich console.print left intact (CLI-facing output, correct)
  • runtime.py docstring-only print() correctly excluded
  • 2641 tests passing, ruff clean

Verdict: Approve with nits. The architectural fix is correct and complete. The f-string and emoji points are style nits that can be addressed in a follow-up logging style pass. Ready to merge.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
codeframe/core/conductor.py (1)

226-226: Consider using parameterized logging with exc_info=True for exception paths to preserve tracebacks and follow logging best practices.

Logger f-strings are present at multiple locations (lines 226, 374-375, 404-405, 2176). While the current Ruff configuration does not include explicit logging rules, parameterized logging is a best practice for lazy evaluation and proper exception context preservation.

For exception paths specifically (lines 226 and 2176), adding exc_info=True is recommended:

Exception paths that would benefit from exc_info
- logger.warning(f"[Supervisor] Classification failed: {e}")
+ logger.warning("[Supervisor] Classification failed: %s", e, exc_info=True)

- logger.error(f"Error executing task: {e}")
+ logger.error("Error executing task: %s", e, exc_info=True)

For non-exception contexts (lines 374-375, 404-405), parameterized logging improves consistency without requiring exc_info:

- logger.info(f"[GlobalFix] Agent taking ownership: {fix_description[:60]}...")
+ logger.info("[GlobalFix] Agent taking ownership: %s...", fix_description[:60])
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@codeframe/core/conductor.py` at line 226, Replace f-string logging with
parameterized logging in the conductor.py file. For exception handling paths
where an exception variable is available (such as the
`logger.warning(f"[Supervisor] Classification failed: {e}")` pattern), use
parameterized logging by passing the exception as a parameter and including
`exc_info=True` to preserve the full traceback for debugging. For non-exception
logging contexts in the file, use parameterized logging without exc_info for
consistency and to enable proper lazy evaluation of the logger. Change from
logger.method(f"message: {variable}") to logger.method("message: %s", variable)
or logger.method("message: %s", variable, exc_info=True) for exception paths.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@codeframe/core/tasks.py`:
- Around line 812-813: The logger.warning() calls are using f-string
interpolation which violates Ruff's LOG001 logging rule. Convert both
logger.warning() calls that currently use f-string format (such as the one with
"LLM generation failed ({e})") to use parameterized logging format instead,
passing the error variable as a separate argument to the logger method rather
than interpolating it into the string. This matches the parameterized logging
pattern already used elsewhere in the file.

In `@tests/core/test_conductor.py`:
- Around line 3-6: The test module test_conductor.py is missing the module-level
pytest marker required by the coding guidelines. Add the statement pytestmark =
pytest.mark.v2 after the import statements at the top of the file to ensure
these tests are included when running the v2 pytest marker filter with uv run
pytest -m v2.
- Around line 180-182: The assertion on line 181 is weakened by the `or
batch.status == BatchStatus.COMPLETED` condition, which allows the test to pass
even when "Execution plan:" is missing from the logs. Remove the `or
batch.status == BatchStatus.COMPLETED` clause from the assertion so it strictly
checks for the presence of "Execution plan:" in caplog.text. Keep line 182
separate to verify that the batch status is COMPLETED, ensuring the regression
test actually validates the logger-channel behavior as intended.

---

Nitpick comments:
In `@codeframe/core/conductor.py`:
- Line 226: Replace f-string logging with parameterized logging in the
conductor.py file. For exception handling paths where an exception variable is
available (such as the `logger.warning(f"[Supervisor] Classification failed:
{e}")` pattern), use parameterized logging by passing the exception as a
parameter and including `exc_info=True` to preserve the full traceback for
debugging. For non-exception logging contexts in the file, use parameterized
logging without exc_info for consistency and to enable proper lazy evaluation of
the logger. Change from logger.method(f"message: {variable}") to
logger.method("message: %s", variable) or logger.method("message: %s", variable,
exc_info=True) for exception paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b98f02d4-717b-4d11-9a88-6186ba1d01d7

📥 Commits

Reviewing files that changed from the base of the PR and between 461d353 and 6f8a9df.

📒 Files selected for processing (4)
  • codeframe/core/conductor.py
  • codeframe/core/tasks.py
  • tests/core/test_conductor.py
  • tests/core/test_conductor_logging.py

Comment thread codeframe/core/tasks.py
Comment thread tests/core/test_conductor.py
Comment thread tests/core/test_conductor.py Outdated
- Add module-level `pytestmark = pytest.mark.v2` to test_conductor.py per
  CLAUDE.md convention (the file is also auto-marked by tests/core/conftest.py).
- Tighten test_start_batch_strategy_parallel_works: assert the "Execution plan:"
  line is logged unconditionally (drop the weak `or batch.status == COMPLETED`
  escape that let the log-channel assertion pass vacuously).
@frankbria

Copy link
Copy Markdown
Owner Author

Thanks @coderabbitai — triaged all three:

1. tasks.py:813 f-string in logger (Minor) — skipping (codebase convention).
f-string logger calls are the established convention here: core/ has 100 logger.<level>(f"...") lines vs 33 lazy-% ones, and ruff check passes (the flake8-logging-format/G004 rules aren't enabled). The two converted lines match the surrounding code; switching them to %-formatting would be inconsistent.

2. test_conductor.py:6 missing v2 marker (Major) — fixed in 87c62ea. Added module-level pytestmark = pytest.mark.v2 per the CLAUDE.md convention (the directory is also auto-marked by tests/core/conftest.py, but explicit is better).

3. test_conductor.py:182 weak log assertion (Minor) — fixed in 87c62ea. Dropped the or batch.status == COMPLETED escape; "Execution plan:" is logged unconditionally in _execute_parallel, so the assertion is now strict.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

@frankbria Thanks for the thorough triage!

  1. f-string logger convention — Understood, withdrawing that suggestion. With 100 f-string usages vs 33 lazy-% calls and G004 not enforced, consistency with the surrounding code is the right call.
  2. v2 marker — Thanks for adding the explicit pytestmark.
  3. Weak assertion — The strict check is cleaner now that "Execution plan:" is guaranteed to be logged.

🐇 ✨


✏️ Learnings added
Learnt from: frankbria
URL: https://github.com/frankbria/codeframe/pull/687

Timestamp: 2026-06-16T01:12:16.016Z
Learning: In the codeframe repository (`codeframe/core/`), f-string interpolation inside logger calls (e.g., `logger.info(f"...")`) is the established codebase convention — there are ~100 such usages vs ~33 lazy-`%` ones, and the `flake8-logging-format` / `G004` Ruff rule is not enabled. Do not flag f-string logger calls as issues in this repo.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

…view)

Replace the vague two-line `Error: {e}` + `Falling back to serial execution`
with a single specific message naming the dependency cycle, in both auto and
parallel strategy fallback paths.
@frankbria

Copy link
Copy Markdown
Owner Author

Thanks @claude — applied finding #3 in the latest commit: the CycleDetectedError fallback now logs a single specific message (Dependency cycle detected, falling back to serial execution: {e}) in both strategy paths. Skipping #1 (f-string) and #2 (emoji): both match established codebase conventions — core/ uses f-string logger calls 100:33 over lazy-% (ruff clean), and emoji in logs are already used throughout (e.g. server.py startup: 🔒/⚠️/🚦). Staying consistent.

@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown

Claude finished @frankbria's task in 45s —— View job


Review Update

  • Load tools
  • Verify CycleDetectedError fix in conductor.py
  • Confirm triage dispositions and close out review

Verified commit 8330c96. The CycleDetectedError fix looks correct in both strategy paths:

# "auto" strategy path (line 717-719)
except CycleDetectedError as e:
    logger.warning(f"Dependency cycle detected, falling back to serial execution: {e}")
    _execute_serial(workspace, batch, on_event)

# "parallel" strategy path (line 727-729)  
except CycleDetectedError as e:
    logger.warning(f"Dependency cycle detected, falling back to serial execution: {e}")
    _execute_serial(workspace, batch, on_event)

Both catches now emit a single descriptive message that identifies the error type, the action taken, and the cause — down from the two-line bare "Error: {e}" + "Falling back to serial execution" split. The generic Exception catch still uses two lines (717–722), which is fine since that path needs to distinguish "analysis failed" from "falling back."

Triage dispositions confirmed sound:

PR is in good shape. All three review items accounted for, 2641 tests passing, ruff clean.

@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown

Follow-up Review — commits 2 & 3

Good follow-through on the feedback. Two specific items addressed since my first review:

Addressed

  • CycleDetectedError fallback message (my point feat(sprint-4): Multi-Agent Coordination System - P0 & P1 Complete #3): both the auto and parallel strategy paths now emit a single, specific message naming the cycle instead of the two-line Error: {e} + Falling back to serial execution. Exactly what was asked for.
  • Test assertion tightened + pytestmark added (CodeRabbit): the weak or batch.status == COMPLETED escape is removed; the log assertion is now unconditional and will actually catch a regression. pytestmark = pytest.mark.v2 on test_conductor.py brings it in line with the project convention.

Still open (both are style nits, not blockers)

  • f-string formatting: the hot-path logger.info(f"...", ...) calls still allocate strings at every call site regardless of log level. Switching to logger.info("...", arg1, arg2) defers formatting to when the handler actually needs it. Worth a follow-up logging style pass across core.
  • exc_info=True on exception paths (also flagged by CodeRabbit): logger.warning(f"[Supervisor] Classification failed: {e}") and logger.error(f"Error executing task: {e}") swallow the traceback. Adding exc_info=True (or switching to logger.exception()) would let log aggregators show the full stack without changing observable behavior.

Neither item blocks merge. The architectural fix is correct, the test guard is solid, and the substantive feedback from both reviews has been incorporated.

Verdict: Approved.

@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown

Follow-up review (commits 2–3)

The author addressed both CodeRabbit feedback and my prior review in the subsequent commits — no new concerns.

Commit 2 (87c62ea): adds the pytestmark = pytest.mark.v2 marker and tightens the test_start_batch_strategy_parallel_works assertion (drops the vacuous or batch.status == COMPLETED escape that let the log-channel check pass without actually asserting log output). Both are correct improvements.

Commit 3 (8330c96): consolidates the two-line "Error: {e}" + "Falling back to serial execution" into a single descriptive message in both the auto and parallel strategy fallback paths. This is a cleaner fix — one log record vs two, and the consolidated message names the cycle explicitly:

logger.warning(f"Dependency cycle detected, falling back to serial execution: {e}")

Remaining open items from prior review (both non-blocking):

  1. f-string formattinglogger.info(f"...", ...)) still allocates the string even when the log level is disabled. Lazy %-style args (logger.info("Execution plan: %d groups, %d tasks", plan.num_groups, plan.total_tasks)) avoid this. Low priority for this PR scope.

  2. Emoji in log messages⊘ Still blocked, ⚠️ Batch marked PARTIAL etc. remain. These were inherited from the original print() output; consider stripping them or replacing with plain ASCII in a follow-up ([BLOCKED], [PARTIAL]) for log-aggregator compatibility.


Overall: The PR is correct, well-scoped, and tests are solid. The architecture violation is fixed, the AST guard prevents regressions, and the follow-up commits improved test quality and log message clarity. Ready to merge.

@frankbria frankbria merged commit a8edbaa into main Jun 16, 2026
10 checks passed
@frankbria frankbria deleted the fix/649-conductor-print-to-logger branch June 16, 2026 01:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P6.6.2] Replace conductor.py print() calls with logger (headless-core violation)

1 participant