docs: Document non-fatal error visibility features#238
docs: Document non-fatal error visibility features#238MervinPraison wants to merge 1 commit intomainfrom
Conversation
- Add comprehensive documentation for TaskOutput.callback_error and non_fatal_errors fields - Document Task.non_fatal_errors attribute for error tracking - Include async memory kwargs passthrough documentation - Add to Safety & Control section in docs.json - Follow AGENTS.md standards with hero Mermaid diagrams, agent-centric examples - Include best practices for error handling in workflows Co-authored-by: Mervin Praison <MervinPraison@users.noreply.github.com>
📝 WalkthroughWalkthroughA new documentation page for non-fatal errors is created, describing how callback failures and memory-operation errors are captured without halting workflows. The page is registered in the site navigation configuration under the "Safety & Control" section. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces documentation for a new 'Non-Fatal Errors' feature, including a new MDX file and an update to the documentation configuration. However, the review highlights significant discrepancies between the documentation and the actual implementation. Specifically, the documentation references attributes like task.output.callback_error, task.output.non_fatal_errors, and task.non_fatal_errors, which are either incorrectly named or entirely missing from the current codebase, leading to potential runtime errors for users following the guide.
| if task.output.callback_error: | ||
| print("Callback failed:", task.output.callback_error) | ||
| if task.output.non_fatal_errors: | ||
| for err in task.output.non_fatal_errors: | ||
| print("Non-fatal:", err) | ||
| ``` |
There was a problem hiding this comment.
There are significant discrepancies between this documentation and the implementation in praisonaiagents/task/task.py:
- Attribute Name: The execution result is stored in
task.result, nottask.output. In theTaskclass,self.output(initialized at line 320 oftask.py) stores the configuration parameter (e.g., a file path or Pydantic model), not the execution output object. Accessingtask.output.callback_errorwill fail. - Missing Implementation: The
callback_errorandnon_fatal_errorsfields are not currently implemented or populated in theTaskorTaskOutputclasses. Theexecute_callbackmethod intask.py(lines 623-734) logs exceptions but does not store them in the output object.
This example will raise an AttributeError at runtime.
| for err in task.non_fatal_errors: | ||
| print(err) # e.g. "callback: downstream webhook timed out" | ||
| ``` |
There was a problem hiding this comment.
| | Surface | Type | When populated | | ||
| |---------|------|----------------| | ||
| | `TaskOutput.callback_error` | `Optional[str]` | Only if a `callback` function raises | | ||
| | `TaskOutput.non_fatal_errors` | `Optional[list[str]]` | All non-fatal errors (memory ops + callback) from this run | | ||
| | `Task.non_fatal_errors` | `list[str]` | Same list, available on the Task instance directly | |
There was a problem hiding this comment.
This table describes API surfaces that do not exist in the provided source code. Neither TaskOutput nor Task currently support these error-tracking fields. The documentation should be aligned with the actual framework capabilities or the corresponding features should be implemented in praisonaiagents/task/task.py.
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
docs/features/non-fatal-errors.mdx (1)
56-62: Step 2 snippet is a bare fragment — confirm Mintlify Steps renders this acceptably.Step 2's code block depends entirely on Step 1's
taskvariable via the# Same workflow as abovecomment. That's a reasonable pedagogical choice, but it means copy-pasting Step 2 alone raisesNameError. If the rendered Steps component displays the two code blocks close together this is fine; otherwise consider consolidating into a single code block under Step 1 with both access patterns shown, to keep each snippet independently runnable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/features/non-fatal-errors.mdx` around lines 56 - 62, The Step 2 snippet references task.non_fatal_errors but depends on the `task` created in Step 1, which makes it non-runnable if Steps are rendered separately; update the docs by either consolidating the two Steps into a single code block that creates `task` and then iterates over `task.non_fatal_errors`, or add a short, self-contained setup line in Step 2 that reconstructs `task` (or a minimal stub) before the loop; locate the code in the Steps component (Step title "Inspecting accumulated errors on the Task itself") and modify the snippet so `task.non_fatal_errors` is shown with a runnable context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/features/non-fatal-errors.mdx`:
- Around line 135-151: The example calls send_alert(...) which is undefined; to
make Pattern B runnable, add a simple alert implementation or replace calls with
a standard output/logger call—e.g., define a stub function named send_alert(msg)
that prints or logs the message and ensure alert_on_errors uses that; verify
Task, alert_on_errors, and PraisonAIAgents remain unchanged except for adding
the send_alert definition or switching the call to print/logger so the snippet
runs without modification.
- Around line 22-29: The Mermaid class color mappings in the hero diagram are
inverted versus the §3.1 palette: update the classDef declarations so input uses
`#8B0000`, process uses `#189AB4`, output uses `#8B0000` (agents/outputs share dark
red), error remains `#8B0000` if it's an agent/output or use `#F59E0B` only for
warnings, and configuration stays `#6366F1`; then ensure the class assignments
(class Task, class Run,CB, class OK,Check, class Capture) reflect those
semantics (Task → input/dark red, Run/CB → process/teal, OK/Check → output/dark
red or success green if meant as success nodes, Capture → error/warning as
appropriate). Apply the same remapping to the User Interaction Flow diagram
instances that use the same class names so all diagrams follow the AGENTS.md
§3.1 color scheme.
- Around line 186-232: The page is missing the required "Configuration Options"
section and a brief "Why it matters" note, and doesn't explicitly mention async
memory kwargs passthrough; add a "Configuration Options" table stub between
"Common Patterns" and "Best Practices" that states "No configuration required —
errors are captured automatically" (Option/Type/Default/Description), insert a
one-paragraph "Why it matters" callout before "Related" summarizing the related
bug-fix changes (thread-safety, async STM fallback fix, duplicate finalize
removal) and their impact on non-fatal error behavior, and add a one-line
mention in the "Memory failures are non-fatal by design" accordion referencing
store_short_term_async and store_long_term_async to clarify that async memory
kwargs passthrough surface memory-side errors through
task.output.non_fatal_errors (and retain reference to quality_check and
non_fatal_errors).
- Around line 101-105: The docs reference missing runtime fields and other
documentation issues: add Optional[str] callback_error and Optional[list[str]]
non_fatal_errors to the TaskOutput dataclass and add list[str] non_fatal_errors
to the Task class so examples referencing TaskOutput.callback_error,
TaskOutput.non_fatal_errors, and Task.non_fatal_errors work; update all code
examples (Quick Start, Pattern A/B, Memory errors) to use those fields and
ensure Pattern B imports the send_alert function (or replace with a documented
helper) so examples run as-is; add the required "Configuration Options" and "Why
it matters" sections (mentioning thread-safety, async STM fallback, duplicate
finalize removal) to the page; change the Mermaid diagrams’ node color
assignments so indigo (`#6366F1`) is used for configuration nodes and amber
(`#F59E0B`) for warnings/process nodes to match the style guide; and expand the
accordion into a short subsection explicitly documenting the
store_short_term_async and store_long_term_async kwargs passthrough for async
memory operations.
---
Nitpick comments:
In `@docs/features/non-fatal-errors.mdx`:
- Around line 56-62: The Step 2 snippet references task.non_fatal_errors but
depends on the `task` created in Step 1, which makes it non-runnable if Steps
are rendered separately; update the docs by either consolidating the two Steps
into a single code block that creates `task` and then iterates over
`task.non_fatal_errors`, or add a short, self-contained setup line in Step 2
that reconstructs `task` (or a minimal stub) before the loop; locate the code in
the Steps component (Step title "Inspecting accumulated errors on the Task
itself") and modify the snippet so `task.non_fatal_errors` is shown with a
runnable context.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f230745f-4149-43c3-a7b4-6a47fbcfd11d
📒 Files selected for processing (2)
docs.jsondocs/features/non-fatal-errors.mdx
| classDef input fill:#6366F1,stroke:#7C90A0,color:#fff | ||
| classDef process fill:#F59E0B,stroke:#7C90A0,color:#fff | ||
| classDef output fill:#10B981,stroke:#7C90A0,color:#fff | ||
| classDef error fill:#8B0000,stroke:#7C90A0,color:#fff | ||
| class Task input | ||
| class Run,CB process | ||
| class OK,Check output | ||
| class Capture error |
There was a problem hiding this comment.
Mermaid classDef colors don't match the AGENTS.md §3.1 semantic mapping.
Per the coding guideline, #8B0000 is for agents/inputs/outputs, #189AB4 for tools/processes, #10B981 for success, #F59E0B for warnings, #6366F1 for configuration. In the hero diagram (and again in User Interaction Flow, lines 176–184), the mapping is inverted: Task (an input) is indigo-configuration, Run/CB (processes) are amber-warning instead of teal, and TaskOutput is colored as "success" green rather than the outputs red. Re-map classes so node semantics match the §3.1 palette, or at minimum use #189AB4 for processes.
🎨 Suggested remap for the hero diagram
- classDef input fill:`#6366F1`,stroke:`#7C90A0`,color:`#fff`
- classDef process fill:`#F59E0B`,stroke:`#7C90A0`,color:`#fff`
- classDef output fill:`#10B981`,stroke:`#7C90A0`,color:`#fff`
- classDef error fill:`#8B0000`,stroke:`#7C90A0`,color:`#fff`
- class Task input
- class Run,CB process
- class OK,Check output
- class Capture error
+ classDef input fill:`#8B0000`,stroke:`#7C90A0`,color:`#fff`
+ classDef process fill:`#189AB4`,stroke:`#7C90A0`,color:`#fff`
+ classDef success fill:`#10B981`,stroke:`#7C90A0`,color:`#fff`
+ classDef warning fill:`#F59E0B`,stroke:`#7C90A0`,color:`#fff`
+ class Task input
+ class Run,CB process
+ class OK success
+ class Check,Capture warningAs per coding guidelines: "Use exact Mermaid color scheme for diagrams: #8B0000 (Dark Red) for agents/inputs/outputs, #189AB4 (Teal) for tools/processes, #10B981 (Green) for success, #F59E0B (Amber) for warnings, #6366F1 (Indigo) for configuration".
Also applies to: 176-184
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/features/non-fatal-errors.mdx` around lines 22 - 29, The Mermaid class
color mappings in the hero diagram are inverted versus the §3.1 palette: update
the classDef declarations so input uses `#8B0000`, process uses `#189AB4`, output
uses `#8B0000` (agents/outputs share dark red), error remains `#8B0000` if it's an
agent/output or use `#F59E0B` only for warnings, and configuration stays `#6366F1`;
then ensure the class assignments (class Task, class Run,CB, class OK,Check,
class Capture) reflect those semantics (Task → input/dark red, Run/CB →
process/teal, OK/Check → output/dark red or success green if meant as success
nodes, Capture → error/warning as appropriate). Apply the same remapping to the
User Interaction Flow diagram instances that use the same class names so all
diagrams follow the AGENTS.md §3.1 color scheme.
| | Surface | Type | When populated | | ||
| |---------|------|----------------| | ||
| | `TaskOutput.callback_error` | `Optional[str]` | Only if a `callback` function raises | | ||
| | `TaskOutput.non_fatal_errors` | `Optional[list[str]]` | All non-fatal errors (memory ops + callback) from this run | | ||
| | `Task.non_fatal_errors` | `list[str]` | Same list, available on the Task instance directly | |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm whether callback_error / non_fatal_errors exist on TaskOutput and Task
# at the SDK SHA referenced by the PR (008805b).
rg -nP --type=py -C3 '\b(callback_error|non_fatal_errors)\b'
echo "---- TaskOutput class definition ----"
ast-grep --pattern 'class TaskOutput($_):
$$$'
echo "---- Task class field / attribute for non_fatal_errors ----"
ast-grep --pattern 'class Task($_):
$$$'
echo "---- How task.output gets populated ----"
rg -nP --type=py -C2 '\bself\.output\s*='Repository: MervinPraison/PraisonAIDocs
Length of output: 5158
TaskOutput does not have callback_error or non_fatal_errors fields in the current SDK implementation.
The table at lines 101-105 and all code examples (Quick Start, Pattern A, Pattern B, Memory errors example) reference TaskOutput.callback_error, TaskOutput.non_fatal_errors, and Task.non_fatal_errors fields that do not exist in praisonaiagents/main.py:669-699. Verification confirms the TaskOutput class only defines description, summary, raw, pydantic, json_dict, agent, output_format, and token_metrics. This means all code snippets using these fields will fail at runtime.
Additionally:
- Lines 135-151 (Pattern B): Uses undefined
send_alertfunction without import; violates "every Python code example must include all necessary imports and run without modification." - Missing Configuration Options section: Guideline requires all documentation pages include this section.
- Missing "Why it matters" section: PR objective requires brief note on related bug-fixes (thread-safety, async STM fallback, duplicate finalize removal).
- Mermaid color mismatch (lines 10-30, 157-185): Hero and User Interaction Flow diagrams use indigo (
#6366F1) for Task (input) and amber (#F59E0B) for process nodes, but guidelines define indigo for configuration and amber for warnings—semantics do not align. - Async memory kwargs passthrough: Only briefly alluded to in accordion; no direct mention of
store_short_term_async/store_long_term_asynckwargs passthrough as stated in PR objectives.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/features/non-fatal-errors.mdx` around lines 101 - 105, The docs
reference missing runtime fields and other documentation issues: add
Optional[str] callback_error and Optional[list[str]] non_fatal_errors to the
TaskOutput dataclass and add list[str] non_fatal_errors to the Task class so
examples referencing TaskOutput.callback_error, TaskOutput.non_fatal_errors, and
Task.non_fatal_errors work; update all code examples (Quick Start, Pattern A/B,
Memory errors) to use those fields and ensure Pattern B imports the send_alert
function (or replace with a documented helper) so examples run as-is; add the
required "Configuration Options" and "Why it matters" sections (mentioning
thread-safety, async STM fallback, duplicate finalize removal) to the page;
change the Mermaid diagrams’ node color assignments so indigo (`#6366F1`) is used
for configuration nodes and amber (`#F59E0B`) for warnings/process nodes to match
the style guide; and expand the accordion into a short subsection explicitly
documenting the store_short_term_async and store_long_term_async kwargs
passthrough for async memory operations.
| ```python | ||
| from praisonaiagents import Agent, Task, PraisonAIAgents | ||
|
|
||
| def alert_on_errors(task): | ||
| if task.non_fatal_errors: | ||
| # Send to monitoring system | ||
| for error in task.non_fatal_errors: | ||
| send_alert(f"Task {task.description}: {error}") | ||
|
|
||
| agent = Agent(name="Processor", instructions="Process data.") | ||
| task = Task(description="Important task", agent=agent) | ||
|
|
||
| workflow = PraisonAIAgents(agents=[agent], tasks=[task]) | ||
| workflow.start() | ||
|
|
||
| alert_on_errors(task) | ||
| ``` |
There was a problem hiding this comment.
Pattern B example is not runnable — send_alert is undefined.
The snippet calls send_alert(...) without importing or defining it, so copy-pasting will raise NameError. Either inline a stub (e.g. def send_alert(msg): print(msg)) or replace with a print/logger call to satisfy the "runs without modification" requirement.
🛠️ Proposed fix
def alert_on_errors(task):
if task.non_fatal_errors:
- # Send to monitoring system
+ # Send to monitoring system (replace with your alerting client)
for error in task.non_fatal_errors:
- send_alert(f"Task {task.description}: {error}")
+ print(f"ALERT — Task {task.description}: {error}")As per coding guidelines: "Every Python code example must include all necessary imports and run without modification".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/features/non-fatal-errors.mdx` around lines 135 - 151, The example calls
send_alert(...) which is undefined; to make Pattern B runnable, add a simple
alert implementation or replace calls with a standard output/logger call—e.g.,
define a stub function named send_alert(msg) that prints or logs the message and
ensure alert_on_errors uses that; verify Task, alert_on_errors, and
PraisonAIAgents remain unchanged except for adding the send_alert definition or
switching the call to print/logger so the snippet runs without modification.
|
|
||
| --- | ||
|
|
||
| ## Best Practices | ||
|
|
||
| <AccordionGroup> | ||
| <Accordion title="Don't treat non-fatal errors as success"> | ||
| Always inspect the output before chaining tasks. A task that completed with non-fatal errors may not have produced the expected result. | ||
|
|
||
| ```python | ||
| if task.output.non_fatal_errors: | ||
| print("Task completed with issues:", task.output.non_fatal_errors) | ||
| # Decide whether to continue or retry | ||
| ``` | ||
| </Accordion> | ||
|
|
||
| <Accordion title="Memory failures are non-fatal by design"> | ||
| If `quality_check=True` was requested but the memory adapter isn't ready, the task still completes. Check `non_fatal_errors` for memory-related issues. | ||
|
|
||
| ```python | ||
| # Look for memory-specific errors | ||
| memory_errors = [err for err in task.non_fatal_errors | ||
| if "memory" in err.lower()] | ||
| if memory_errors: | ||
| print("Memory operations failed:", memory_errors) | ||
| ``` | ||
| </Accordion> | ||
|
|
||
| <Accordion title="Raise explicitly for critical callbacks"> | ||
| If a callback must block the workflow on failure, re-raise inside it. The framework currently captures the message but does NOT re-raise automatically. | ||
|
|
||
| ```python | ||
| def critical_callback(output): | ||
| try: | ||
| # Critical operation | ||
| result = send_to_external_api(output.raw) | ||
| except Exception as e: | ||
| # Log the error and re-raise to stop the workflow | ||
| log.error(f"Critical callback failed: {e}") | ||
| raise # This will stop the entire workflow | ||
| ``` | ||
| </Accordion> | ||
| </AccordionGroup> | ||
|
|
||
| --- | ||
|
|
||
| ## Related |
There was a problem hiding this comment.
Missing "Configuration Options" section and the PR-required "Why it matters" note.
Two structural gaps:
- The standard page template requires a
Configuration Optionssection (table of Option / Type / Default / Description) between Common Patterns and Best Practices — currently absent. Even if this feature has no user-facing config, a brief "No configuration required — errors are captured automatically" stub keeps the template consistent. - PR objectives explicitly call for a brief Why it matters note mentioning related bug-fix changes (thread-safety, async STM fallback fix, duplicate finalize removal). That note is not present on the page. A one-paragraph callout near the top (after the hero diagram) or a dedicated section before
Relatedwould satisfy this.
Additionally, the PR objectives mention async memory kwargs passthrough (store_short_term_async / store_long_term_async) should be referenced in context — currently only indirectly hinted at via the quality_check accordion. Consider adding a one-line mention so readers understand memory-side errors surface through the same non_fatal_errors list.
As per coding guidelines: "All documentation pages must follow the standard page structure template with frontmatter, hero diagram, Quick Start, How It Works, Configuration Options, Common Patterns, Best Practices, and Related sections".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/features/non-fatal-errors.mdx` around lines 186 - 232, The page is
missing the required "Configuration Options" section and a brief "Why it
matters" note, and doesn't explicitly mention async memory kwargs passthrough;
add a "Configuration Options" table stub between "Common Patterns" and "Best
Practices" that states "No configuration required — errors are captured
automatically" (Option/Type/Default/Description), insert a one-paragraph "Why it
matters" callout before "Related" summarizing the related bug-fix changes
(thread-safety, async STM fallback fix, duplicate finalize removal) and their
impact on non-fatal error behavior, and add a one-line mention in the "Memory
failures are non-fatal by design" accordion referencing store_short_term_async
and store_long_term_async to clarify that async memory kwargs passthrough
surface memory-side errors through task.output.non_fatal_errors (and retain
reference to quality_check and non_fatal_errors).
Summary
Documents the new non-fatal error visibility features from upstream PR MervinPraison/PraisonAI#1529.
Changes
Documentation Features
✅ AGENTS.md compliant:
✅ Best practices included:
Navigation
Added to Safety & Control section in docs.json between guardrails and security features.
Fixes #237
🤖 Generated with Claude Code
Summary by CodeRabbit