feat(tasks): richer task generation with recursive decomposition (#420)#444
Conversation
Add parent_id, lineage, is_leaf, and hierarchical_id fields to support hierarchical task decomposition. Includes DB schema, migrations, and CRUD operations with full backward compatibility.
Update the LLM prompt to request complexity, estimated_hours, uncertainty, depends_on_titles, and files_to_modify. Parse and validate these fields with clamping/defaults. Resolve title-based dependencies to task IDs post-creation.
…ommand (#420 steps 5-6) - Add --recursive/-r and --max-depth options to `cf tasks generate` for recursive task decomposition via generate_task_tree/flatten_task_tree - Add `cf tasks tree` command to display task hierarchy as ASCII tree - Add lineage context to TaskContextPackager.build() prompt when task has lineage data, appearing before verification gates - 11 new tests: 7 CLI tests + 4 context packager lineage tests
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds recursive task decomposition and tree management via a new task_tree module, extends Task schema with tree fields, integrates lineage into prompts, adds CLI flags for recursive generation and a tasks_tree command, and includes schema migrations plus comprehensive tests for tree generation, flattening, display, and propagation. Changes
Sequence DiagramssequenceDiagram
participant CLI as "tasks_generate CLI"
participant TreeGen as "generate_task_tree"
participant Classify as "classify_task"
participant Provider as "LLM Provider"
participant Decompose as "decompose_task"
participant Flatten as "flatten_task_tree"
participant Workspace as "Workspace DB"
CLI->>TreeGen: Invoke with PRD description, lineage, max_depth
TreeGen->>Classify: Ask: atomic or composite? (includes lineage)
Classify->>Provider: LLM prompt (with lineage)
Provider-->>Classify: "atomic" / "composite"
alt atomic or depth limit reached
Classify-->>TreeGen: Return leaf node
else composite
TreeGen->>Decompose: Request 2–7 subtasks (with lineage)
Decompose->>Provider: LLM prompt (with lineage)
Provider-->>Decompose: Subtasks (JSON / markdown)
Decompose-->>TreeGen: Parsed subtasks
loop per subtask
TreeGen->>TreeGen: Recurse (lineage + parent, depth+1)
end
end
TreeGen->>Flatten: Return nested tree to flatten
Flatten->>Workspace: Create task rows (parent_id, lineage, hierarchical_id)
Workspace-->>Flatten: Persisted task IDs
Flatten-->>CLI: Completed
sequenceDiagram
participant CLI as "tasks_tree CLI"
participant Workspace as "Workspace DB"
participant Display as "display_task_tree"
participant Render as "_render_node"
participant Console as "Console Output"
CLI->>Display: Request tree for workspace
Display->>Workspace: Load all tasks
Workspace-->>Display: Return task list
Display->>Display: Build parent→children map, sort roots
loop per root
Display->>Render: Render node line (status, kind, hierarchical id)
Render-->>Console: Emit line
loop per child
Render->>Render: Recurse with indent
Render-->>Console: Emit child line
end
end
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
codeframe/cli/app.py (1)
1597-1616:⚠️ Potential issue | 🟠 MajorHandle
--recursiveand--no-llmas mutually exclusive.Line 1597 skips LLM validation when
--no-llmis set, but Line 1601 still executes the LLM-only recursive path.cf tasks generate --recursive --no-llmcurrently has contradictory behavior and can fail with a confusing provider error path.💡 Suggested fix
- if not no_llm: + if recursive and no_llm: + console.print("[red]Error:[/red] --recursive cannot be combined with --no-llm") + raise typer.Exit(1) + + if not no_llm: from codeframe.cli.validators import require_anthropic_api_key require_anthropic_api_key()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@codeframe/cli/app.py` around lines 1597 - 1616, The recursive branch currently runs LLM-only logic even when --no-llm is passed; make the flags mutually exclusive by validating them before using LLM functions: add a check that if recursive and no_llm are both true, print an error (or raise/exit) and abort, or change the recursive condition to "if recursive and not no_llm" so the code only calls get_provider(), generate_task_tree, and flatten_task_tree when an LLM is allowed; also ensure require_anthropic_api_key() is only called when an LLM will actually be used (i.e., when not no_llm).
🧹 Nitpick comments (1)
tests/cli/test_tasks_tree_cli.py (1)
71-77: Assertresult.exit_codein the happy-path CLI tests.Right now both tests can pass even if the command crashes after calling the mocked function. An exit-code assertion will make the behavior check meaningful and prevent false positives.
Proposed fix
result = runner.invoke( app, ["tasks", "generate", "--recursive", "-w", str(tmp_path)], ) + assert result.exit_code == 0, result.output mock_gen_tree.assert_called_once() mock_flatten.assert_called_once() @@ result = runner.invoke( app, ["tasks", "generate", "-w", str(tmp_path)], ) + assert result.exit_code == 0, result.output mock_gen.assert_called_once()Also applies to: 104-109
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/cli/test_tasks_tree_cli.py` around lines 71 - 77, Add an explicit exit-code assertion after the CLI invocation to ensure the command completed successfully: after the runner.invoke(app, ["tasks", "generate", "--recursive", "-w", str(tmp_path)]) call, assert result.exit_code == 0 (do the same for the other similar test later in the file that also calls runner.invoke and asserts mock_gen_tree/mock_flatten). This ensures the test fails if the CLI crashes even when the mocks were called.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@codeframe/core/task_tree.py`:
- Around line 26-29: The recursive decomposition currently only requests and
persists 'title'/'description' (DECOMPOSE_SYSTEM_PROMPT and the code paths that
parse/save decomposed children), losing richer task metadata; update the
decomposition prompt (DECOMPOSE_SYSTEM_PROMPT) to require the full task schema
(complexity, estimate, dependencies, files, acceptance_criteria, etc.) and
modify the recursive parse/save logic (the functions that parse the
decomposition response and the code that creates/persists child nodes in
TaskTree) to accept and persist all those fields instead of only
title/description/tree fields so downstream features receive full metadata.
- Around line 396-400: The parent status update in propagate_status mutates
state via task_module.update_status(parent.id) but doesn't emit the required
audit event; after the successful change where new_status != parent.status, call
the core events API (import core.events) to emit a status-transition event
(e.g., core.events.emit_status_transition or the project's equivalent) with the
task id, old status (parent.status), new_status and any workspace/context so all
auto-propagated transitions are recorded; ensure the emit happens only when
update_status is invoked and include the same identifiers used in
propagate_status and task_module.update_status.
- Around line 305-308: The sort key currently treats hierarchical_id as a string
so "1.10" sorts before "1.2"; change the key used when sorting children_map
entries so it converts t.hierarchical_id into a tuple of integers (e.g.,
tuple(map(int, t.hierarchical_id.split('.')))) and fall back to t.title or a
safe tuple when hierarchical_id is None or contains non-numeric parts; update
the sort call on children_map (the lambda at the children_map[pid].sort(...)
line) to parse and return the numeric tuple for proper numeric hierarchical
ordering.
In `@codeframe/core/tasks.py`:
- Around line 559-561: The task creation currently only persists
complexity_score, estimated_hours, and uncertainty_level from task_data and
flattens/loses other structured metadata; update the Task construction (where
complexity_score, estimated_hours, uncertainty_level are set) to also assign
files_to_modify=task_data.get("files_to_modify"),
files_to_create=task_data.get("files_to_create"), and
acceptance_criteria=task_data.get("acceptance_criteria") instead of folding them
into description, and apply the same fix to the other parsing/creation sites
referenced (the blocks around lines 592-599 and 643-655) so the structured
metadata survives parsing and downstream callers can consume those fields.
- Around line 631-637: The conversions for task["complexity"] and
task["estimated_hours"] are unsafe for untrusted LLM output; wrap the int() and
float() conversions in try/except (catch ValueError and TypeError), clamp valid
parsed values as before (complexity -> max(1,min(5,int(...))), estimated_hours
-> max(0.1,float(...))) and on parse failure log a warning and set the field to
None or a safe default so generate_from_prd() does not treat a single malformed
value as a hard failure; update the code around the task dict handling where
complexity and estimated_hours are parsed to implement this defensive parsing
and logging.
- Around line 565-571: The current resolution loop silently drops dependency
titles that don't exactly match and never refreshes created_tasks with the
updated objects returned by update_depends_on; change the logic in the block
that builds title_to_id and iterates over tasks_data/created_tasks so that it
(1) detects and surfaces missing or duplicate titles from
tasks_data.get("depends_on_titles") (raise an exception or return a clear
error/log instead of silently skipping) and (2) assigns the result of
update_depends_on(workspace, task.id, dep_ids) back into created_tasks (i.e.,
replace the task in created_tasks with the returned Task) so callers receive
up-to-date depends_on values; retain references to title_to_id, created_tasks,
update_depends_on, tasks_data, and depends_on_titles when making these changes.
In `@codeframe/core/workspace.py`:
- Around line 424-435: The upgrade path in _ensure_schema_upgrades() must add
the github_issue_number column so tasks.py's unconditional SELECT won't fail;
add a check like the existing ones for "github_issue_number" on the tasks table
and run an ALTER TABLE tasks ADD COLUMN github_issue_number TEXT (or INTEGER)
DEFAULT NULL followed by conn.commit(); update the block that checks
task_columns (in codeframe/core/workspace.py, the function
_ensure_schema_upgrades) to include this ALTER and ensure the schema is present
before any reads from codeframe/core/tasks.py.
In `@tests/core/test_context_packager.py`:
- Around line 465-542: The module tests/core/test_context_packager.py contains
new v2 lineage tests (e.g. TestLineageContext and its methods like
test_context_packager_includes_lineage) but lacks the module-level v2 marker;
add an import for pytest and set pytestmark = pytest.mark.v2 at the top of the
file so all tests in this module are marked v2 per the repo testing rules.
In `@tests/core/test_rich_task_generation.py`:
- Around line 10-13: Remove the unused import PrdRecord from the test file to
satisfy the linter: edit the imports at the top of
tests/core/test_rich_task_generation.py and delete the reference to PrdRecord so
only used symbols (MockProvider, tasks, create_or_load_workspace) remain; ensure
no other references to PrdRecord exist in the file before committing.
---
Outside diff comments:
In `@codeframe/cli/app.py`:
- Around line 1597-1616: The recursive branch currently runs LLM-only logic even
when --no-llm is passed; make the flags mutually exclusive by validating them
before using LLM functions: add a check that if recursive and no_llm are both
true, print an error (or raise/exit) and abort, or change the recursive
condition to "if recursive and not no_llm" so the code only calls
get_provider(), generate_task_tree, and flatten_task_tree when an LLM is
allowed; also ensure require_anthropic_api_key() is only called when an LLM will
actually be used (i.e., when not no_llm).
---
Nitpick comments:
In `@tests/cli/test_tasks_tree_cli.py`:
- Around line 71-77: Add an explicit exit-code assertion after the CLI
invocation to ensure the command completed successfully: after the
runner.invoke(app, ["tasks", "generate", "--recursive", "-w", str(tmp_path)])
call, assert result.exit_code == 0 (do the same for the other similar test later
in the file that also calls runner.invoke and asserts
mock_gen_tree/mock_flatten). This ensures the test fails if the CLI crashes even
when the mocks were called.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0c296c73-8567-4445-9b9a-c4367ed4656a
📒 Files selected for processing (10)
codeframe/cli/app.pycodeframe/core/context_packager.pycodeframe/core/task_tree.pycodeframe/core/tasks.pycodeframe/core/workspace.pytests/cli/test_tasks_tree_cli.pytests/core/test_context_packager.pytests/core/test_rich_task_generation.pytests/core/test_task_tree.pytests/core/test_task_tree_schema.py
| DECOMPOSE_SYSTEM_PROMPT = ( | ||
| "Break this task into 2-7 concrete subtasks. Each should be actionable and " | ||
| "testable. Return a JSON array of objects with 'title' and 'description' fields." | ||
| ) |
There was a problem hiding this comment.
Recursive flow currently drops rich task metadata.
The recursive decomposition contract only captures title/description, and Line 254-263 persists only those plus tree fields. That means recursive-generated tasks miss richer generation data (e.g., complexity/estimate/dependencies/files/acceptance criteria), so downstream features won’t get populated in this path.
Also applies to: 121-160, 254-263
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/task_tree.py` around lines 26 - 29, The recursive
decomposition currently only requests and persists 'title'/'description'
(DECOMPOSE_SYSTEM_PROMPT and the code paths that parse/save decomposed
children), losing richer task metadata; update the decomposition prompt
(DECOMPOSE_SYSTEM_PROMPT) to require the full task schema (complexity, estimate,
dependencies, files, acceptance_criteria, etc.) and modify the recursive
parse/save logic (the functions that parse the decomposition response and the
code that creates/persists child nodes in TaskTree) to accept and persist all
those fields instead of only title/description/tree fields so downstream
features receive full metadata.
| for pid in children_map: | ||
| children_map[pid].sort( | ||
| key=lambda t: t.hierarchical_id or t.title | ||
| ) |
There was a problem hiding this comment.
Sort key for hierarchical_id is lexicographic, not numeric.
Line 305-308 will order 1.10 before 1.2. This breaks tree display ordering once siblings exceed 9.
💡 Suggested fix
+def _hierarchical_sort_key(task) -> tuple:
+ hid = task.hierarchical_id
+ if hid:
+ parts = []
+ for p in str(hid).split("."):
+ parts.append(int(p) if p.isdigit() else p)
+ return (0, parts)
+ return (1, task.title.lower())
+
# Sort children by hierarchical_id if available, else by title
for pid in children_map:
- children_map[pid].sort(
- key=lambda t: t.hierarchical_id or t.title
- )
+ children_map[pid].sort(key=_hierarchical_sort_key)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for pid in children_map: | |
| children_map[pid].sort( | |
| key=lambda t: t.hierarchical_id or t.title | |
| ) | |
| def _hierarchical_sort_key(task) -> tuple: | |
| hid = task.hierarchical_id | |
| if hid: | |
| parts = [] | |
| for p in str(hid).split("."): | |
| parts.append(int(p) if p.isdigit() else p) | |
| return (0, parts) | |
| return (1, task.title.lower()) | |
| # Sort children by hierarchical_id if available, else by title | |
| for pid in children_map: | |
| children_map[pid].sort(key=_hierarchical_sort_key) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/task_tree.py` around lines 305 - 308, The sort key currently
treats hierarchical_id as a string so "1.10" sorts before "1.2"; change the key
used when sorting children_map entries so it converts t.hierarchical_id into a
tuple of integers (e.g., tuple(map(int, t.hierarchical_id.split('.')))) and fall
back to t.title or a safe tuple when hierarchical_id is None or contains
non-numeric parts; update the sort call on children_map (the lambda at the
children_map[pid].sort(...) line) to parse and return the numeric tuple for
proper numeric hierarchical ordering.
| if new_status and new_status != parent.status: | ||
| task_module.update_status(workspace, parent.id, new_status) | ||
|
|
||
| # Recursively propagate upward | ||
| propagate_status(workspace, parent.id) |
There was a problem hiding this comment.
Emit events when propagated parent status changes.
Line 397 mutates parent task state but does not emit a status-transition event, so audit/observability is incomplete for auto-propagated transitions.
💡 Suggested fix
+from codeframe.core.events import emit_for_workspace, EventType
@@
if new_status and new_status != parent.status:
+ old_status = parent.status
task_module.update_status(workspace, parent.id, new_status)
+ emit_for_workspace(
+ workspace,
+ EventType.TASK_STATUS_CHANGED,
+ {
+ "task_id": parent.id,
+ "old_status": old_status.value,
+ "new_status": new_status.value,
+ },
+ print_event=False,
+ )As per coding guidelines codeframe/core/**/*.py: “All core modules must emit events for state transitions via core/events.py for audit and observability”.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if new_status and new_status != parent.status: | |
| task_module.update_status(workspace, parent.id, new_status) | |
| # Recursively propagate upward | |
| propagate_status(workspace, parent.id) | |
| from codeframe.core.events import emit_for_workspace, EventType | |
| if new_status and new_status != parent.status: | |
| old_status = parent.status | |
| task_module.update_status(workspace, parent.id, new_status) | |
| emit_for_workspace( | |
| workspace, | |
| EventType.TASK_STATUS_CHANGED, | |
| { | |
| "task_id": parent.id, | |
| "old_status": old_status.value, | |
| "new_status": new_status.value, | |
| }, | |
| print_event=False, | |
| ) | |
| # Recursively propagate upward | |
| propagate_status(workspace, parent.id) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/task_tree.py` around lines 396 - 400, The parent status update
in propagate_status mutates state via task_module.update_status(parent.id) but
doesn't emit the required audit event; after the successful change where
new_status != parent.status, call the core events API (import core.events) to
emit a status-transition event (e.g., core.events.emit_status_transition or the
project's equivalent) with the task id, old status (parent.status), new_status
and any workspace/context so all auto-propagated transitions are recorded;
ensure the emit happens only when update_status is invoked and include the same
identifiers used in propagate_status and task_module.update_status.
| complexity_score=task_data.get("complexity"), | ||
| estimated_hours=task_data.get("estimated_hours"), | ||
| uncertainty_level=task_data.get("uncertainty"), |
There was a problem hiding this comment.
The richer metadata contract is still lossy.
This flow only persists complexity/hours/uncertainty. files_to_modify gets flattened into description, and files_to_create / acceptance_criteria never survive parsing at all, so downstream callers still cannot consume the structured metadata this PR is supposed to add.
Also applies to: 592-599, 643-655
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/tasks.py` around lines 559 - 561, The task creation currently
only persists complexity_score, estimated_hours, and uncertainty_level from
task_data and flattens/loses other structured metadata; update the Task
construction (where complexity_score, estimated_hours, uncertainty_level are
set) to also assign files_to_modify=task_data.get("files_to_modify"),
files_to_create=task_data.get("files_to_create"), and
acceptance_criteria=task_data.get("acceptance_criteria") instead of folding them
into description, and apply the same fix to the other parsing/creation sites
referenced (the blocks around lines 592-599 and 643-655) so the structured
metadata survives parsing and downstream callers can consume those fields.
| # Resolve title-based dependencies to task IDs | ||
| title_to_id = {t.title: t.id for t in created_tasks} | ||
| for task_data, task in zip(tasks_data, created_tasks): | ||
| dep_titles = task_data.get("depends_on_titles", []) | ||
| dep_ids = [title_to_id[t] for t in dep_titles if t in title_to_id] | ||
| if dep_ids: | ||
| update_depends_on(workspace, task.id, dep_ids) |
There was a problem hiding this comment.
Dependency resolution should not silently weaken the graph or return stale tasks.
A typo, truncation mismatch, or duplicate title just disappears here because only exact matches survive. Also, update_depends_on() returns the updated Task, but created_tasks is never refreshed, so callers get stale depends_on values back even when resolution succeeds.
Proposed fix
- title_to_id = {t.title: t.id for t in created_tasks}
- for task_data, task in zip(tasks_data, created_tasks):
+ title_to_id = {t.title: t.id for t in created_tasks}
+ if len(title_to_id) != len(created_tasks):
+ raise ValueError("Duplicate task titles prevent dependency resolution")
+
+ for i, (task_data, task) in enumerate(zip(tasks_data, created_tasks)):
dep_titles = task_data.get("depends_on_titles", [])
- dep_ids = [title_to_id[t] for t in dep_titles if t in title_to_id]
+ unresolved = [title for title in dep_titles if title not in title_to_id]
+ if unresolved:
+ raise ValueError(
+ f"Unresolved dependencies for {task.title}: {', '.join(unresolved)}"
+ )
+
+ dep_ids = [title_to_id[title] for title in dep_titles]
if dep_ids:
- update_depends_on(workspace, task.id, dep_ids)
+ created_tasks[i] = update_depends_on(workspace, task.id, dep_ids)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/tasks.py` around lines 565 - 571, The current resolution loop
silently drops dependency titles that don't exactly match and never refreshes
created_tasks with the updated objects returned by update_depends_on; change the
logic in the block that builds title_to_id and iterates over
tasks_data/created_tasks so that it (1) detects and surfaces missing or
duplicate titles from tasks_data.get("depends_on_titles") (raise an exception or
return a clear error/log instead of silently skipping) and (2) assigns the
result of update_depends_on(workspace, task.id, dep_ids) back into created_tasks
(i.e., replace the task in created_tasks with the returned Task) so callers
receive up-to-date depends_on values; retain references to title_to_id,
created_tasks, update_depends_on, tasks_data, and depends_on_titles when making
these changes.
| complexity = task.get("complexity") | ||
| if complexity is not None: | ||
| complexity = max(1, min(5, int(complexity))) | ||
|
|
||
| estimated_hours = task.get("estimated_hours") | ||
| if estimated_hours is not None: | ||
| estimated_hours = max(0.1, float(estimated_hours)) |
There was a problem hiding this comment.
Handle malformed numeric metadata defensively.
These values come from untrusted LLM output. int() / float() can raise here, and generate_from_prd() currently treats ValueError as a hard config failure, so one bad "complexity" or "estimated_hours" field can abort the entire generation path instead of degrading gracefully.
Proposed fix
- complexity = task.get("complexity")
- if complexity is not None:
- complexity = max(1, min(5, int(complexity)))
+ complexity = task.get("complexity")
+ try:
+ complexity = (
+ max(1, min(5, int(complexity)))
+ if complexity is not None
+ else None
+ )
+ except (TypeError, ValueError):
+ complexity = None
- estimated_hours = task.get("estimated_hours")
- if estimated_hours is not None:
- estimated_hours = max(0.1, float(estimated_hours))
+ estimated_hours = task.get("estimated_hours")
+ try:
+ estimated_hours = (
+ max(0.1, float(estimated_hours))
+ if estimated_hours is not None
+ else None
+ )
+ except (TypeError, ValueError):
+ estimated_hours = None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/tasks.py` around lines 631 - 637, The conversions for
task["complexity"] and task["estimated_hours"] are unsafe for untrusted LLM
output; wrap the int() and float() conversions in try/except (catch ValueError
and TypeError), clamp valid parsed values as before (complexity ->
max(1,min(5,int(...))), estimated_hours -> max(0.1,float(...))) and on parse
failure log a warning and set the field to None or a safe default so
generate_from_prd() does not treat a single malformed value as a hard failure;
update the code around the task dict handling where complexity and
estimated_hours are parsed to implement this defensive parsing and logging.
| if "parent_id" not in task_columns: | ||
| cursor.execute("ALTER TABLE tasks ADD COLUMN parent_id TEXT") | ||
| conn.commit() | ||
| if "lineage" not in task_columns: | ||
| cursor.execute("ALTER TABLE tasks ADD COLUMN lineage TEXT DEFAULT '[]'") | ||
| conn.commit() | ||
| if "is_leaf" not in task_columns: | ||
| cursor.execute("ALTER TABLE tasks ADD COLUMN is_leaf INTEGER DEFAULT 1") | ||
| conn.commit() | ||
| if "hierarchical_id" not in task_columns: | ||
| cursor.execute("ALTER TABLE tasks ADD COLUMN hierarchical_id TEXT") | ||
| conn.commit() |
There was a problem hiding this comment.
Backfill github_issue_number in the upgrade path before shipping this schema change.
codeframe/core/tasks.py now selects github_issue_number unconditionally, but _ensure_schema_upgrades() still never adds that column for pre-existing workspaces. On an older DB, the first task read will fail with sqlite3.OperationalError: no such column: github_issue_number.
Proposed fix
if "uncertainty_level" not in task_columns:
cursor.execute("ALTER TABLE tasks ADD COLUMN uncertainty_level TEXT")
conn.commit()
+ if "github_issue_number" not in task_columns:
+ cursor.execute("ALTER TABLE tasks ADD COLUMN github_issue_number INTEGER")
+ conn.commit()
if "parent_id" not in task_columns:
cursor.execute("ALTER TABLE tasks ADD COLUMN parent_id TEXT")
conn.commit()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/workspace.py` around lines 424 - 435, The upgrade path in
_ensure_schema_upgrades() must add the github_issue_number column so tasks.py's
unconditional SELECT won't fail; add a check like the existing ones for
"github_issue_number" on the tasks table and run an ALTER TABLE tasks ADD COLUMN
github_issue_number TEXT (or INTEGER) DEFAULT NULL followed by conn.commit();
update the block that checks task_columns (in codeframe/core/workspace.py, the
function _ensure_schema_upgrades) to include this ALTER and ensure the schema is
present before any reads from codeframe/core/tasks.py.
| class TestLineageContext: | ||
| """Tests for lineage inclusion in build() prompt.""" | ||
|
|
||
| def test_context_packager_includes_lineage(self, mock_workspace): | ||
| """Task with lineage should have 'Task Lineage' section in prompt.""" | ||
| ctx = MagicMock(spec=TaskContext) | ||
| ctx.task = MagicMock() | ||
| ctx.task.lineage = ["Build app", "Authentication module"] | ||
| ctx.to_prompt_context.return_value = ( | ||
| "## Task\n**Title:** Implement JWT\n**Description:** Add tokens\n" | ||
| ) | ||
| ctx.relevant_files = [] | ||
|
|
||
| with patch("codeframe.core.context_packager.ContextLoader") as MockLoader: | ||
| MockLoader.return_value.load.return_value = ctx | ||
|
|
||
| packager = TaskContextPackager(mock_workspace) | ||
| result = packager.build("task-1") | ||
|
|
||
| assert "Task Lineage" in result.prompt | ||
| assert "Build app" in result.prompt | ||
| assert "Authentication module" in result.prompt | ||
|
|
||
| def test_context_packager_no_lineage(self, mock_workspace): | ||
| """Task without lineage should not have 'Task Lineage' section.""" | ||
| ctx = MagicMock(spec=TaskContext) | ||
| ctx.task = MagicMock() | ||
| ctx.task.lineage = [] | ||
| ctx.to_prompt_context.return_value = ( | ||
| "## Task\n**Title:** Simple task\n**Description:** Do it\n" | ||
| ) | ||
| ctx.relevant_files = [] | ||
|
|
||
| with patch("codeframe.core.context_packager.ContextLoader") as MockLoader: | ||
| MockLoader.return_value.load.return_value = ctx | ||
|
|
||
| packager = TaskContextPackager(mock_workspace) | ||
| result = packager.build("task-1") | ||
|
|
||
| assert "Task Lineage" not in result.prompt | ||
|
|
||
| def test_context_packager_lineage_missing_attribute(self, mock_workspace): | ||
| """Task without lineage attribute should not have 'Task Lineage' section.""" | ||
| ctx = MagicMock(spec=TaskContext) | ||
| ctx.task = MagicMock(spec=["title", "description", "id"]) | ||
| # No lineage attribute on task | ||
| ctx.to_prompt_context.return_value = ( | ||
| "## Task\n**Title:** Old task\n**Description:** Legacy\n" | ||
| ) | ||
| ctx.relevant_files = [] | ||
|
|
||
| with patch("codeframe.core.context_packager.ContextLoader") as MockLoader: | ||
| MockLoader.return_value.load.return_value = ctx | ||
|
|
||
| packager = TaskContextPackager(mock_workspace) | ||
| result = packager.build("task-1") | ||
|
|
||
| assert "Task Lineage" not in result.prompt | ||
|
|
||
| def test_lineage_appears_before_gates(self, mock_workspace): | ||
| """Lineage section should appear before Verification Gates.""" | ||
| ctx = MagicMock(spec=TaskContext) | ||
| ctx.task = MagicMock() | ||
| ctx.task.lineage = ["Parent task"] | ||
| ctx.to_prompt_context.return_value = ( | ||
| "## Task\n**Title:** Child task\n" | ||
| ) | ||
| ctx.relevant_files = [] | ||
|
|
||
| with patch("codeframe.core.context_packager.ContextLoader") as MockLoader: | ||
| MockLoader.return_value.load.return_value = ctx | ||
|
|
||
| packager = TaskContextPackager(mock_workspace) | ||
| result = packager.build("task-1") | ||
|
|
||
| lineage_pos = result.prompt.index("Task Lineage") | ||
| gates_pos = result.prompt.index("Verification Gates") | ||
| assert lineage_pos < gates_pos |
There was a problem hiding this comment.
Add the missing v2 marker for this module.
These new lineage tests cover v2 functionality, but the file still lacks a module-level pytestmark = pytest.mark.v2, so they won't follow the repo's test-selection rules.
Proposed fix
import pytest
from pathlib import Path
from unittest.mock import MagicMock, patch
+pytestmark = pytest.mark.v2
+
from codeframe.core.context_packager import TaskContextPackager, PackagedContext
from codeframe.core.context import TaskContext, FileInfo
from codeframe.core.adapters.agent_adapter import AgentContext🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/core/test_context_packager.py` around lines 465 - 542, The module
tests/core/test_context_packager.py contains new v2 lineage tests (e.g.
TestLineageContext and its methods like test_context_packager_includes_lineage)
but lacks the module-level v2 marker; add an import for pytest and set
pytestmark = pytest.mark.v2 at the top of the file so all tests in this module
are marked v2 per the repo testing rules.
|
PR Review: feat(tasks): richer task generation with recursive decomposition This is a substantial and well-structured PR. The module separation is clean (task_tree.py is properly headless), test coverage is solid (51 tests), and the schema migration approach is consistent with the existing pattern. A few critical issues to address before merging. Critical Issues 1. generate_task_tree receives the full PRD as description, not a task title In cli/app.py, the recursive path calls generate_task_tree(provider, prd_record.content, lineage=[], depth=0, max_depth=max_depth). The function calls classify_task on this string, asking the LLM 'is this atomic or composite?' - but it is a full PRD, not a task description. The LLM will almost certainly classify it as composite and decompose_task will try to break the entire PRD into 2-7 subtasks, which is not the intended behavior. The correct flow: first extract top-level tasks from the PRD (as _generate_tasks_with_llm does today), then recursively decompose each task. Without this fix, the recursive path produces a fundamentally different structure than what the feature description implies. 2. propagate_status is defined but never called The PR description says 'Children done, parent auto-completes' but propagate_status is not wired into runtime.py or state_machine.py. Status propagation will never fire during normal agent execution. Either wire it in (e.g., call it from complete_run() / fail_run() after update_status) or document clearly that it is a utility for future integration. 3. LLM call explosion with no user warning With max_depth=3 and composite tasks generating up to 7 subtasks, worst-case is approximately 800 LLM calls for a single 'cf tasks generate --recursive'. The user gets no cost estimate or call-count warning. At max_depth=5 (the CLI limit), this becomes catastrophically expensive. At minimum, print an estimated call-count warning or require an explicit --confirm flag. Moderate Issues 4. _row_to_task positional indexing is fragile The new fields use positional row[14], row[15], row[16], row[17] with len(row) > N guards. This already existed for github_issue_number at row[13] and is a known maintenance risk - any query path reusing _row_to_task without the new columns selected will silently return wrong data. Consider using sqlite3.Row factory with named column access (row['parent_id']) to make this robust across future refactors. 5. Hierarchical ID sorting uses string comparison children_map[pid].sort(key=lambda t: t.hierarchical_id or t.title) - String sort puts '1.10' before '1.2'. Fix: key=lambda t: [int(p) for p in t.hierarchical_id.split('.')] if t.hierarchical_id else [999] 6. files_to_modify is appended to description rather than stored The LLM is asked to provide files_to_modify but the values are concatenated into the description string. This pollutes the description field and makes the data unusable programmatically. Either add a dedicated model field, or drop it from the prompt until it can be stored cleanly. Minor Observations
What Is Working Well
Two blockers before merge: (1) the PRD-as-description design bug in generate_task_tree, and (2) propagate_status not being wired in. The other items can be tracked as follow-up issues. |
Summary
Closes #420
complexity_score,estimated_hours,uncertainty_level,depends_on, andfiles_to_modify— fields that existed in the model but were never populatedclassify_task()→decompose_task()→generate_task_tree()breaks PRDs into hierarchical atomic/composite task trees (inspired by tinyagi/fractals)parent_id,lineage,is_leaf,hierarchical_idto Task model and DB schemacf tasks generate --recursive [--max-depth N]andcf tasks tree(ASCII hierarchy display)Files Changed
core/tasks.py,core/workspace.pycore/tasks.pycore/task_tree.py(NEW)core/context_packager.pycli/app.pytreecommandTest plan
cf tasks generatewith rich metadatacf tasks treewith hierarchical displaySummary by CodeRabbit
New Features
Tests