docs: update AsyncAgentScheduler page for exception-handling fixes + correct existing API drift (PraisonAI PR #1566) #269

@MervinPraison

Context

Upstream PR MervinPraison/PraisonAI#1566 (closes issue #1565) shipped two defensive bug-fixes to AsyncAgentScheduler in src/praisonai/praisonai/async_agent_scheduler.py. The PR is a pure reliability hardening — no new public API, no parameter changes, no breaking behaviour.

However, while reading the SDK against the existing docs I found three things that justify a documentation update:

  1. The PR introduces a new reliability guarantee (is_running is always cleared, stop() is exception-safe) that users embedding the scheduler in long-running services like FastAPI should know about.
  2. Pre-existing factual drift between the docs and the SDK (the stop() timeout, the get_stats() field names, and an awaited call to the synchronous get_stats()) that should be corrected at the same time.
  3. Two overlapping pages (async-agent-scheduler.mdx and async-scheduler.mdx) that document the same class with slightly different content — a small consolidation opportunity.

This is a content update task — no new page. All changes land in existing files under docs/features/.

Folder placement: updates only in docs/features/. Do NOT touch docs/concepts/ (human-only) or auto-generated docs/sdk/reference/* files. Per AGENTS.md §1.8.

Upstream references:


Summary of SDK changes that drive the doc update

| # | Method | What changed | User-visible? |
|---|--------|--------------|---------------|
| 1 | _run_schedule() | Wrapped the entire while not self._stop_event.is_set() loop in try / finally. self.is_running = False is now executed even if _execute_with_retry() raises an unexpected exception. | ✅ Reliability guarantee |
| 2 | stop() | Wrapped the cancel/await flow in try / finally. Now catches both asyncio.CancelledError (silently, since it is expected on shutdown) and generic Exception (logged via logger.error). self.is_running = False is now in the finally block. | ✅ Reliability guarantee: stop() no longer raises if the scheduler task crashed |

No new constructor params, no new methods, no behaviour change on the happy path. The fixes only manifest when an unexpected exception escapes the inner execution loop or when the scheduled task itself blows up during cancellation.
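The two fixes follow a common asyncio cleanup pattern, which can be sketched in isolation. This is not the SDK source: `MiniScheduler`, its `job` argument, the `mini_scheduler` logger name, and `crashing_job` are hypothetical stand-ins, and the 10-second wait is assumed from the stop() timeout discussed in this issue.

```python
import asyncio
import logging

logger = logging.getLogger("mini_scheduler")  # hypothetical logger name


class MiniScheduler:
    """Toy stand-in for a scheduler; not the PraisonAI implementation."""

    def __init__(self, job, interval=0.01):
        self._job = job                  # hypothetical async callable
        self._interval = interval
        self._stop_event = asyncio.Event()
        self._task = None
        self.is_running = False

    async def start(self):
        self.is_running = True
        self._task = asyncio.create_task(self._run_schedule())

    async def _run_schedule(self):
        try:
            while not self._stop_event.is_set():
                await self._job()        # may raise unexpectedly
                await asyncio.sleep(self._interval)
        finally:
            # Runs on clean exit, unhandled exception, or cancellation.
            self.is_running = False

    async def stop(self):
        self._stop_event.set()
        try:
            if self._task is not None:
                await asyncio.wait_for(self._task, timeout=10)
        except asyncio.CancelledError:
            pass                         # expected on shutdown
        except Exception as exc:
            logger.error("scheduler task failed: %s", exc)
        finally:
            self.is_running = False
        return True


async def crashing_job():
    raise RuntimeError("boom")


async def demo():
    scheduler = MiniScheduler(crashing_job)
    await scheduler.start()
    await asyncio.sleep(0.05)            # let the background task crash
    flag_after_crash = scheduler.is_running
    stopped = await scheduler.stop()     # logs the error, does not raise
    return flag_after_crash, stopped


print(asyncio.run(demo()))  # prints (False, True)
```

In this sketch the flag is already cleared before stop() is called, because the `finally` in the loop runs when the job raises. The `finally` in stop() covers the remaining paths, such as a timeout while waiting for the task.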


Existing docs that need to change

Found via mcp__github__search_code repo:MervinPraison/PraisonAIDocs AsyncAgentScheduler:

| Path | Status | Action |
|------|--------|--------|
| docs/features/async-agent-scheduler.mdx | Canonical, detailed page | Primary update target |
| docs/features/async-scheduler.mdx | Older, overlapping page | Update + flag for consolidation |
| docs/cli/scheduler.mdx | CLI wrapper page (sync praisonai schedule …) | No changes needed for this PR; already covered by issue #261 |

Required changes — page by page

1. docs/features/async-agent-scheduler.mdx (primary target)

1a. Fix the stop() timeout — currently incorrect

The "Best Practices" accordion "Always await scheduler.stop() before exiting" (around line 245) currently states:

"The stop() method waits up to 30 seconds for the current execution to complete before canceling."

The actual SDK code uses timeout=10 (async_agent_scheduler.py line ~189). Replace with "10 seconds", or — simpler — just say "waits for the current execution to complete, then cancels."

1b. Fix the get_stats() field table — currently wrong

The "get_stats() Response" table (around line 155) lists:

| execution_count | int  | Total number of execution attempts |
| success_count   | int  | Number of successful executions    |
| failure_count   | int  | Number of failed executions        |
| agent_name      | str  | Name of the agent being scheduled  |
| task            | str  | Task description being executed    |

The actual SDK (get_stats() lines ~217–226) returns:

{
    "is_running": self.is_running,
    "total_executions": self._execution_count,
    "successful_executions": self._success_count,
    "failed_executions": self._failure_count,
    "success_rate": (self._success_count / self._execution_count * 100) if self._execution_count > 0 else 0
}

Replace the table with:

| Field                    | Type  | Description                                              |
|--------------------------|-------|----------------------------------------------------------|
| `is_running`             | bool  | Whether the scheduler loop is currently active          |
| `total_executions`       | int   | Total number of execution attempts                       |
| `successful_executions`  | int   | Number of successful executions                          |
| `failed_executions`      | int   | Number of failed executions                              |
| `success_rate`           | float | Successful / total × 100, or `0` when no executions yet |

Note: the agent_name and task fields are not returned by get_stats() — remove them.
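The corrected field names and the zero-execution guard can be checked with a standalone sketch. `build_stats` is a hypothetical helper that mirrors the dictionary quoted above; it is not the SDK method, and the counts passed in are made-up sample values.

```python
def build_stats(is_running, execution_count, success_count, failure_count):
    """Hypothetical helper mirroring the get_stats() dictionary quoted above."""
    return {
        "is_running": is_running,
        "total_executions": execution_count,
        "successful_executions": success_count,
        "failed_executions": failure_count,
        # Guard against division by zero before the first execution.
        "success_rate": (success_count / execution_count * 100)
        if execution_count > 0 else 0,
    }


stats = build_stats(is_running=True, execution_count=4, success_count=3, failure_count=1)
print(f"Completed {stats['successful_executions']} successful executions")
print(f"Success rate: {stats['success_rate']:.1f}%")

# The old documented key is absent, so stats['success_count'] would raise KeyError.
print("success_count" in stats)
```

A scheduler that has not executed anything yet reports a `success_rate` of `0` rather than raising `ZeroDivisionError`, which is why the table describes the field as "or `0` when no executions yet".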

1c. Fix the success_count reference in the "With Callbacks" example

The Quick Start "With Callbacks" example (around line 85) ends with:

print(f"Completed {stats['success_count']} successful executions")

stats['success_count'] will raise KeyError against the real SDK. Replace with:

print(f"Completed {stats['successful_executions']} successful executions")

1d. get_stats() is sync, not async

The same example calls await scheduler.get_stats(). Looking at the SDK, get_stats() is defined as a regular def, not async def (line 216). Awaiting it will raise TypeError. Update both the Quick Start example and the third Common Pattern (Graceful Shutdown on SIGINT, line ~225) to call scheduler.get_stats() without await.

Verify: grep -n "def get_stats" src/praisonai/praisonai/async_agent_scheduler.py — confirm it is def, not async def.
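The failure mode is easy to reproduce without the SDK. In the sketch below, `FakeScheduler` is a hypothetical stand-in whose only property in common with the real class is that get_stats() is a plain `def` returning a dict.

```python
import asyncio


class FakeScheduler:
    """Hypothetical stand-in whose get_stats() is a plain (sync) method."""

    def get_stats(self):
        return {"total_executions": 0}


async def main():
    scheduler = FakeScheduler()
    try:
        await scheduler.get_stats()      # wrong: the return value is a dict
    except TypeError as exc:
        error = str(exc)
    else:
        error = None
    stats = scheduler.get_stats()        # right: call it directly
    return error, stats


error, stats = asyncio.run(main())
print(error)
print(stats)
```

The call itself succeeds; the `TypeError` comes from `await` being applied to the returned dict, so the error only appears at runtime on that line.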

1e. Add a new Best Practice: "Reliability and exception safety"

After the existing "Always await scheduler.stop() before exiting" accordion, add a new one:

<Accordion title="The scheduler is exception-safe by design">
The scheduler guarantees that:

- **`is_running` always reflects reality.** When the internal scheduling loop exits — whether on a clean stop, an unhandled exception, or task cancellation — `is_running` is cleared in a `finally` block.
- **`stop()` never raises.** If the scheduler task crashed in the background, `stop()` logs the error via the standard `logging` module and still returns `True` so your shutdown path stays clean.

This means you can drive the scheduler from a `lifespan` context manager or signal handler without wrapping `stop()` in your own `try/except`:

```python
from contextlib import asynccontextmanager
from fastapi import FastAPI

# `agent` (an Agent instance) and AsyncAgentScheduler are assumed to be in scope.
@asynccontextmanager
async def lifespan(app: FastAPI):
    scheduler = AsyncAgentScheduler(agent, task="")
    await scheduler.start("hourly")
    yield
    await scheduler.stop()  # always safe, even if the task crashed earlier
```

To surface scheduler crashes in your own monitoring, configure the `praisonai.async_agent_scheduler` logger:

```python
import logging
logging.getLogger("praisonai.async_agent_scheduler").setLevel(logging.ERROR)
```
</Accordion>

(One sentence intro; agent-centric example; uses real import; references the actual logger name from the SDK.)
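Setting the level alone does not route records anywhere specific, so a monitoring integration would typically attach a handler as well. The sketch below uses only the standard `logging` module; `ListHandler` is a hypothetical collector, and the error record is emitted by hand to simulate a crash, since the SDK's real message text is not quoted in this issue.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that collects records for a monitoring hook."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.records = []

    def emit(self, record):
        self.records.append(record)


scheduler_logger = logging.getLogger("praisonai.async_agent_scheduler")
handler = ListHandler()
scheduler_logger.addHandler(handler)
scheduler_logger.setLevel(logging.ERROR)

# Simulated emission; the SDK's real message text is not reproduced here.
scheduler_logger.error("simulated scheduler failure")
scheduler_logger.info("ignored: below the ERROR threshold")

print([record.getMessage() for record in handler.records])
```

Replacing `ListHandler.emit` with a call into an alerting client is one way to forward scheduler failures without wrapping stop() in application code.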

1f. (Optional but recommended) Add a Mermaid diagram showing the cleanup guarantee

After the existing "How It Works" sequence diagram, add a short flow diagram so the new guarantee is visible:

graph TB
    A[🔄 Scheduler Loop] --> B{Exception?}
    B -->|No| C[✅ Clean exit]
    B -->|Yes - in _execute_with_retry| D[❗ Logged]
    D --> E[finally: is_running = False]
    C --> E
    F[👤 User calls stop&#40;&#41;] --> G{Task exception?}
    G -->|No| H[✅ Returns True]
    G -->|CancelledError| H
    G -->|Other Exception| I[❗ logger.error]
    I --> H

    classDef loop fill:#189AB4,stroke:#7C90A0,color:#fff
    classDef good fill:#10B981,stroke:#7C90A0,color:#fff
    classDef warn fill:#F59E0B,stroke:#7C90A0,color:#fff
    classDef user fill:#8B0000,stroke:#7C90A0,color:#fff

    class A loop
    class C,E,H good
    class D,I warn
    class F,G,B user

Standard color scheme per AGENTS.md §3.1.

2. docs/features/async-scheduler.mdx

This page covers the same AsyncAgentScheduler class with overlapping content. For this PR:

  1. Apply the same get_stats() field-name fix (the page's stats reference is generic, but verify).
  2. Add the same "Reliability and exception safety" accordion to the Best Practices section.
  3. Flag the duplication (do not delete in this PR — leave a TODO comment in a hidden frontmatter field, or mention it in the implementing PR description). A future content-cleanup PR should consolidate these two pages into docs/features/async-agent-scheduler.mdx and add a redirect from /docs/features/async-scheduler in docs.json.

3. docs/cli/scheduler.mdx

No changes needed for PR #1566. The import-path issues on this page are already tracked in issue #261 (PraisonAI PR #1552) — do not duplicate that work here.


What NOT to do (per AGENTS.md)

  • ❌ Do not create new files in docs/concepts/. All changes are updates inside docs/features/.
  • ❌ Do not edit auto-generated SDK reference pages under docs/sdk/reference/praisonai/ — those are regenerated from source.
  • ❌ Do not add a new "Migration" or "Reliability" page — the change is small enough to stay inline as one Best-Practice accordion + a Mermaid diagram.
  • ❌ Do not restructure the page or rewrite passages that are already correct. Surgical edits only.
  • ❌ Do not touch docs.json "Concepts" group entries.
  • ❌ Do not consolidate the two overlapping scheduler pages in this PR — that is a separate content-cleanup task. Just flag the duplication.

Style & structure requirements

Follow AGENTS.md strictly:

  • Lead each new accordion / section with one sentence of intro (§6.2).
  • Use Mintlify <Accordion>, <Note>, <Warning> components for callouts (§4).
  • Mermaid diagram uses the standard color scheme (§3.1) — #8B0000 agents/users, #189AB4 process, #10B981 success, #F59E0B warning, #fff text.
  • Code examples must run unmodified against the current praisonai package (§5.1) — keep imports complete, no placeholders.
  • Agent-centric framing: every code snippet starts from an Agent(...) (§1.1.9).
  • No forbidden phrases (§6.3) — "as you can see", "it is important to note", "in this section we will", etc.

Acceptance criteria

  • docs/features/async-agent-scheduler.mdx stop() timeout is corrected from "30 seconds" to "10 seconds" (or rephrased to omit the specific number).
  • docs/features/async-agent-scheduler.mdx get_stats() table lists the actual fields: is_running, total_executions, successful_executions, failed_executions, success_rate. The non-existent agent_name and task fields are removed.
  • The "With Callbacks" example uses stats['successful_executions'], not stats['success_count'].
  • All await scheduler.get_stats() calls in code examples are changed to scheduler.get_stats() (it's sync).
  • A new "The scheduler is exception-safe by design" accordion appears in the Best Practices section, mentioning both the is_running cleanup guarantee and stop()'s exception safety, with a working FastAPI lifespan example and a logger configuration tip.
  • (Optional) A new flow Mermaid diagram visualises the cleanup guarantee, using the standard color scheme.
  • docs/features/async-scheduler.mdx carries the same Best-Practice accordion. Page is flagged in the PR description as a duplication candidate for a future cleanup PR.
  • No file in docs/concepts/ is touched.
  • No file in docs/sdk/reference/ is touched.
  • No file in docs/cli/ is touched (covered separately by issue #261, "docs: update scheduler / tool_resolver / telemetry pages for wrapper-layer refactor (PraisonAI PR #1552)").
  • All examples copy-paste-run against the current praisonai package.
  • PR opened as draft against claude/admiring-euler-Zh2aQ (per session branch instructions).

Verification commands for the implementing agent

# Confirm SDK truth before writing — these are the lines this issue is based on
sed -n '180,210p' src/praisonai/praisonai/async_agent_scheduler.py   # stop()
sed -n '215,245p' src/praisonai/praisonai/async_agent_scheduler.py   # get_stats() + _run_schedule() finally
grep -n "def get_stats" src/praisonai/praisonai/async_agent_scheduler.py   # confirm sync def

# Confirm no doc still claims the wrong field names after edits
grep -rn "success_count\|failure_count\|execution_count" docs/features/async-agent-scheduler.mdx
grep -rn "30 seconds" docs/features/async-agent-scheduler.mdx

# Confirm await get_stats() is gone
grep -rn "await.*get_stats" docs/features/
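The post-edit grep checks can also be expressed as a small Python function, for example for a CI step. This is a sketch only: `find_stale_references` and `STALE_PATTERNS` are hypothetical names, the patterns are copied from the grep commands above, and the sample text is invented for the demonstration.

```python
import re

# Patterns taken from the grep commands above.
STALE_PATTERNS = {
    "old stats key": re.compile(r"success_count|failure_count|execution_count"),
    "old stop() timeout": re.compile(r"30 seconds"),
    "awaited get_stats()": re.compile(r"await.*get_stats"),
}


def find_stale_references(text):
    """Return (line_number, label, line) for every stale reference in text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STALE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label, line.strip()))
    return hits


sample = """stats = await scheduler.get_stats()
print(stats['success_count'])
print(stats['successful_executions'])
"""
for hit in find_stale_references(sample):
    print(hit)
```

To run it against a page, pass the file contents, for example `find_stale_references(Path("docs/features/async-agent-scheduler.mdx").read_text())` with `Path` imported from `pathlib`; an empty list means the checks pass.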

Generated from PR #1566 review by Claude per AGENTS.md documentation-creation cycle (read SDK → understand → document).
