docs: update AsyncAgentScheduler page for exception-handling fixes + correct existing API drift (PraisonAI PR #1566)

## Context

Upstream PR **MervinPraison/PraisonAI#1566** (closes issue #1565) shipped two defensive bug-fixes to `AsyncAgentScheduler` in `src/praisonai/praisonai/async_agent_scheduler.py`. The PR is a pure reliability hardening — **no new public API, no parameter changes, no breaking behaviour**.

However, while reading the SDK against the existing docs I found three things that justify a documentation update:

1. The PR introduces a **new reliability guarantee** (`is_running` is always cleared, `stop()` is exception-safe) that users embedding the scheduler in long-running services like FastAPI should know about.
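2. Several **pre-existing factual drifts** between the docs and the SDK (the `stop()` timeout, the `get_stats()` field names, and `get_stats()` being sync rather than async) that should be corrected at the same time.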
3. Two overlapping pages (`async-agent-scheduler.mdx` and `async-scheduler.mdx`) that document the same class with slightly different content — a small consolidation opportunity.

This is a **content update** task — no new page. All changes land in existing files under `docs/features/`.

> **Folder placement:** updates only in `docs/features/`. Do **NOT** touch `docs/concepts/` (human-only) or auto-generated `docs/sdk/reference/*` files. Per `AGENTS.md` §1.8.

Upstream references:
- Merged PR: https://github.com/MervinPraison/PraisonAI/pull/1566
- Source issue: https://github.com/MervinPraison/PraisonAI/issues/1565
- Head SHA: `1e21ac221f8cadc5fb68ebb7f23998c29220ae08`
- Modified file: `src/praisonai/praisonai/async_agent_scheduler.py`
- New tests: `src/praisonai/tests/unit/scheduler/test_async_agent_scheduler.py`

---

## Summary of SDK changes that drive the doc update

| # | Method | What changed | User-visible? |
|---|---|---|---|
| 1 | `_run_schedule()` | Wrapped the entire `while not self._stop_event.is_set()` loop in `try / finally`. `self.is_running = False` is now executed even if `_execute_with_retry()` raises an unexpected exception. | ✅ Reliability guarantee |
| 2 | `stop()` | Wrapped the cancel/await flow in `try / finally`. Now catches both `asyncio.CancelledError` (silently — expected on shutdown) **and** generic `Exception` (logged via `logger.error`). `self.is_running = False` is now in the `finally` block. | ✅ Reliability guarantee — `stop()` no longer raises if the scheduler task crashed |

No new constructor params, no new methods, no behaviour change on the happy path. The fixes only manifest when an unexpected exception escapes the inner execution loop or when the scheduled task itself blows up during cancellation.
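The two fixes follow a common asyncio cleanup pattern. The sketch below illustrates that pattern with a hypothetical `MiniScheduler` class. It is not the SDK source: only the names `_run_schedule`, `stop`, `_stop_event`, and `is_running` come from the PR description, and everything else (the class, the `job` callable, the logger name) is an assumption made for illustration.

```python
import asyncio
import logging

logger = logging.getLogger("sketch.scheduler")  # hypothetical logger name


class MiniScheduler:
    """Hypothetical stand-in, NOT the real AsyncAgentScheduler."""

    def __init__(self, job, interval=0.01):
        self._job = job                    # async callable run on every tick
        self._interval = interval
        self._stop_event = asyncio.Event()
        self._task = None
        self.is_running = False

    async def start(self):
        self.is_running = True
        self._task = asyncio.create_task(self._run_schedule())

    async def _run_schedule(self):
        try:
            while not self._stop_event.is_set():
                await self._job()          # may raise unexpectedly
                await asyncio.sleep(self._interval)
        finally:
            # Runs on clean exit, on a crash, and on cancellation.
            self.is_running = False

    async def stop(self):
        self._stop_event.set()
        try:
            if self._task is not None:
                self._task.cancel()
                await self._task           # re-raises whatever ended the task
        except asyncio.CancelledError:
            pass                           # expected during shutdown
        except Exception as exc:
            logger.error("scheduler task failed: %s", exc)
        finally:
            self.is_running = False
        return True


async def crashing_job():
    raise RuntimeError("boom")


async def healthy_job():
    return None


async def demo(job):
    scheduler = MiniScheduler(job)
    await scheduler.start()
    await asyncio.sleep(0.05)              # give the background task time to run
    running_before_stop = scheduler.is_running
    stopped = await scheduler.stop()
    return running_before_stop, stopped, scheduler.is_running
```

With `crashing_job`, the flag is already cleared before `stop()` is called, and `stop()` logs the error instead of raising. With `healthy_job`, the flag stays set until `stop()` cancels the task.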

---

## Existing docs that need to change

Found via `mcp__github__search_code repo:MervinPraison/PraisonAIDocs AsyncAgentScheduler`:

| Path | Status | Action |
|---|---|---|
| `docs/features/async-agent-scheduler.mdx` | Canonical, detailed page | **Primary update target** |
| `docs/features/async-scheduler.mdx` | Older, overlapping page | Update + flag for consolidation |
| `docs/cli/scheduler.mdx` | CLI wrapper page (sync `praisonai schedule …`) | No changes needed for this PR — already covered by issue #261 |

---

## Required changes — page by page

### 1. `docs/features/async-agent-scheduler.mdx` (primary target)

#### 1a. Fix the `stop()` timeout — currently incorrect

The "Best Practices" accordion **"Always await scheduler.stop() before exiting"** (around line ~245) currently states:

> "The `stop()` method waits up to **30 seconds** for the current execution to complete before canceling."

The actual SDK code uses `timeout=10` (`async_agent_scheduler.py` line ~189). **Replace with "10 seconds"**, or — simpler — just say "waits for the current execution to complete, then cancels."
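For readers unfamiliar with the "wait, then cancel" idea behind that sentence, here is a minimal sketch of one way to implement a bounded wait in asyncio. Only the `timeout=10` value is confirmed by the source. The helper name `stop_with_grace` and the use of `asyncio.wait` are assumptions, and the SDK may structure this differently.

```python
import asyncio


async def stop_with_grace(task: asyncio.Task, timeout: float) -> str:
    """Wait up to `timeout` seconds for `task`, then cancel it if still pending."""
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    if task in done:
        return "completed"
    task.cancel()
    try:
        await task                         # let the cancellation propagate
    except asyncio.CancelledError:
        pass
    return "cancelled"


async def demo_grace():
    fast = asyncio.create_task(asyncio.sleep(0.01))
    slow = asyncio.create_task(asyncio.sleep(10))
    return (
        await stop_with_grace(fast, timeout=0.5),
        await stop_with_grace(slow, timeout=0.05),
    )
```

The short timeouts keep the demo fast. The documented behaviour corresponds to passing `timeout=10`.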

#### 1b. Fix the `get_stats()` field table — currently wrong

The **"get_stats() Response"** table (around line ~155) lists:

```
| execution_count | int  | Total number of execution attempts |
| success_count   | int  | Number of successful executions    |
| failure_count   | int  | Number of failed executions        |
| agent_name      | str  | Name of the agent being scheduled  |
| task            | str  | Task description being executed    |
```

The actual SDK (`get_stats()` lines ~217–226) returns:

```python
{
    "is_running": self.is_running,
    "total_executions": self._execution_count,
    "successful_executions": self._success_count,
    "failed_executions": self._failure_count,
    "success_rate": (self._success_count / self._execution_count * 100) if self._execution_count > 0 else 0
}
```

Replace the table with:

```
| Field                    | Type  | Description                                              |
|--------------------------|-------|----------------------------------------------------------|
| `is_running`             | bool  | Whether the scheduler loop is currently active          |
| `total_executions`       | int   | Total number of execution attempts                       |
| `successful_executions`  | int   | Number of successful executions                          |
| `failed_executions`      | int   | Number of failed executions                              |
| `success_rate`           | float | Successful / total × 100, or `0` when no executions yet |
```

> Note: the `agent_name` and `task` fields are **not** returned by `get_stats()` — remove them.
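To make the field semantics concrete, the sketch below rebuilds the same dictionary shape from plain counters. `build_stats` is a hypothetical helper written for this issue, not an SDK function. The keys and the zero-division guard mirror the `get_stats()` body quoted above.

```python
def build_stats(is_running: bool, executions: int, successes: int, failures: int) -> dict:
    """Return a dict with the same keys as the get_stats() body quoted above."""
    # Guard against division by zero before the first execution.
    success_rate = (successes / executions * 100) if executions > 0 else 0
    return {
        "is_running": is_running,
        "total_executions": executions,
        "successful_executions": successes,
        "failed_executions": failures,
        "success_rate": success_rate,
    }


fresh = build_stats(False, 0, 0, 0)        # nothing has run yet
busy = build_stats(True, 4, 3, 1)          # three of four runs succeeded
```

Note that `success_rate` is the integer `0` before any execution and a float afterwards, which is why the table documents the `0` case explicitly.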

#### 1c. Fix the `success_count` reference in the "With Callbacks" example

The Quick Start "With Callbacks" example (around line ~85) ends with:

```python
print(f"Completed {stats['success_count']} successful executions")
```

`stats['success_count']` will raise `KeyError` against the real SDK. Replace with:

```python
print(f"Completed {stats['successful_executions']} successful executions")
```

#### 1d. `get_stats()` is sync, not async

The same example calls `await scheduler.get_stats()`. Looking at the SDK, `get_stats()` is defined as a **regular `def`**, not `async def` (line 216). Awaiting it will raise `TypeError`. Update both the Quick Start example and the third Common Pattern (`Graceful Shutdown on SIGINT`, line ~225) to call `scheduler.get_stats()` without `await`.

> Verify: `grep -n "def get_stats" src/praisonai/praisonai/async_agent_scheduler.py` — confirm it is `def`, not `async def`.
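The failure mode can be reproduced without the SDK. The sketch below uses a hypothetical `FakeScheduler` whose `get_stats()` is a plain `def` returning a dict, which is the shape this issue describes. The class and the helper function names are assumptions made for illustration.

```python
import asyncio


class FakeScheduler:
    """Hypothetical stand-in with a synchronous get_stats(), as described above."""

    def get_stats(self):
        return {
            "is_running": False,
            "total_executions": 3,
            "successful_executions": 2,
            "failed_executions": 1,
            "success_rate": 2 / 3 * 100,
        }


async def read_stats_with_await(scheduler):
    # Wrong: a dict is not awaitable, so this line raises TypeError.
    return await scheduler.get_stats()


async def read_stats(scheduler):
    # Right: call the method directly, even inside a coroutine.
    return scheduler.get_stats()


def awaiting_fails(scheduler) -> bool:
    try:
        asyncio.run(read_stats_with_await(scheduler))
    except TypeError:
        return True
    return False
```

The same stand-in also shows the `KeyError` from section 1c: the returned dict has `successful_executions` but no `success_count` key.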

#### 1e. Add a new Best Practice: "Reliability and exception safety"

After the existing **"Always await scheduler.stop() before exiting"** accordion, add a new one:

````mdx
<Accordion title="The scheduler is exception-safe by design">
The scheduler guarantees that:

- **`is_running` always reflects reality.** When the internal scheduling loop exits (on a clean stop, an unhandled exception, or task cancellation), `is_running` is cleared in a `finally` block.
- **`stop()` never raises.** If the scheduler task crashed in the background, `stop()` logs the error via the standard `logging` module and still returns `True` so your shutdown path stays clean.

This means you can drive the scheduler from a `lifespan` context manager or signal handler without wrapping `stop()` in your own `try/except`:

```python
@asynccontextmanager
async def lifespan(app: FastAPI):
    scheduler = AsyncAgentScheduler(agent, task="…")
    await scheduler.start("hourly")
    yield
    await scheduler.stop()  # always safe, even if the task crashed earlier
```

To surface scheduler crashes in your own monitoring, configure the `praisonai.async_agent_scheduler` logger:

```python
import logging
logging.getLogger("praisonai.async_agent_scheduler").setLevel(logging.ERROR)
```
</Accordion>
````

(Requirements for the published accordion: a one-sentence intro, an agent-centric example that starts from an `Agent(...)` with complete imports, and the actual logger name from the SDK.)

#### 1f. (Optional but recommended) Add a Mermaid diagram showing the cleanup guarantee

After the existing "How It Works" sequence diagram, add a short flow diagram so the new guarantee is visible:

```mermaid
graph TB
    A[🔄 Scheduler Loop] --> B{Exception?}
    B -->|No| C[✅ Clean exit]
    B -->|Yes - in _execute_with_retry| D[❗ Logged]
    D --> E[finally: is_running = False]
    C --> E
    F[👤 User calls stop&#40;&#41;] --> G{Task exception?}
    G -->|No| H[✅ Returns True]
    G -->|CancelledError| H
    G -->|Other Exception| I[❗ logger.error]
    I --> H

    classDef loop fill:#189AB4,stroke:#7C90A0,color:#fff
    classDef good fill:#10B981,stroke:#7C90A0,color:#fff
    classDef warn fill:#F59E0B,stroke:#7C90A0,color:#fff
    classDef user fill:#8B0000,stroke:#7C90A0,color:#fff

    class A loop
    class C,E,H good
    class D,I warn
    class F,G,B user
```

Standard color scheme per `AGENTS.md` §3.1.

### 2. `docs/features/async-scheduler.mdx`

This page covers the same `AsyncAgentScheduler` class with overlapping content. For this PR:

1. Apply the same `get_stats()` field-name fix wherever it applies. The page's `stats` reference looks generic, but verify that none of the old field names appear.
2. Add the same **"Reliability and exception safety"** accordion to the **Best Practices** section.
3. **Flag the duplication** (do not delete in this PR — leave a TODO comment in a hidden frontmatter field, or mention it in the implementing PR description). A future content-cleanup PR should consolidate these two pages into `docs/features/async-agent-scheduler.mdx` and add a redirect from `/docs/features/async-scheduler` in `docs.json`.

### 3. `docs/cli/scheduler.mdx`

No changes needed for PR #1566. The import-path issues on this page are already tracked in **issue #261** (PraisonAI PR #1552) — do not duplicate that work here.

---

## What NOT to do (per `AGENTS.md`)

- ❌ Do **not** create new files in `docs/concepts/`. All changes are updates inside `docs/features/`.
- ❌ Do **not** edit auto-generated SDK reference pages under `docs/sdk/reference/praisonai/` — those are regenerated from source.
- ❌ Do **not** add a new "Migration" or "Reliability" page — the change is small enough to stay inline as one Best-Practice accordion + a Mermaid diagram.
- ❌ Do **not** restructure the page or rewrite passages that are already correct. Surgical edits only.
- ❌ Do **not** touch `docs.json` "Concepts" group entries.
- ❌ Do **not** consolidate the two overlapping scheduler pages in this PR — that is a separate content-cleanup task. Just flag the duplication.

## Style & structure requirements

Follow `AGENTS.md` strictly:

- Lead each new accordion / section with **one sentence** of intro (§6.2).
- Use Mintlify `<Accordion>`, `<Note>`, `<Warning>` components for callouts (§4).
- Mermaid diagram uses the standard color scheme (§3.1) — `#8B0000` agents/users, `#189AB4` process, `#10B981` success, `#F59E0B` warning, `#fff` text.
- Code examples must run unmodified against the current `praisonai` package (§5.1) — keep imports complete, no placeholders.
- Agent-centric framing: every code snippet starts from an `Agent(...)` (§1.1.9).
- No forbidden phrases (§6.3) — "as you can see", "it is important to note", "in this section we will", etc.

## Acceptance criteria

- [ ] `docs/features/async-agent-scheduler.mdx` `stop()` timeout is corrected from "30 seconds" to "10 seconds" (or rephrased to omit the specific number).
- [ ] `docs/features/async-agent-scheduler.mdx` `get_stats()` table lists the **actual** fields: `is_running`, `total_executions`, `successful_executions`, `failed_executions`, `success_rate`. The non-existent `agent_name` and `task` fields are removed.
- [ ] The "With Callbacks" example uses `stats['successful_executions']`, not `stats['success_count']`.
- [ ] All `await scheduler.get_stats()` calls in code examples are changed to `scheduler.get_stats()` (it's sync).
- [ ] A new **"The scheduler is exception-safe by design"** accordion appears in the Best Practices section, mentioning both the `is_running` cleanup guarantee and `stop()`'s exception safety, with a working FastAPI lifespan example and a logger configuration tip.
- [ ] (Optional) A new flow Mermaid diagram visualises the cleanup guarantee, using the standard color scheme.
- [ ] `docs/features/async-scheduler.mdx` carries the same Best-Practice accordion. Page is flagged in the PR description as a duplication candidate for a future cleanup PR.
- [ ] No file in `docs/concepts/` is touched.
- [ ] No file in `docs/sdk/reference/` is touched.
- [ ] No file in `docs/cli/` is touched (covered separately by issue #261).
- [ ] All examples copy-paste-run against the current `praisonai` package.
- [ ] PR opened as draft against `claude/admiring-euler-Zh2aQ` (per session branch instructions).

## Verification commands for the implementing agent

```bash
# Confirm SDK truth before writing — these are the lines this issue is based on
sed -n '180,210p' src/praisonai/praisonai/async_agent_scheduler.py   # stop()
sed -n '215,245p' src/praisonai/praisonai/async_agent_scheduler.py   # get_stats() + _run_schedule() finally
grep -n "def get_stats" src/praisonai/praisonai/async_agent_scheduler.py   # confirm sync def

# Confirm no doc still claims the wrong field names after edits
grep -rn "success_count\|failure_count\|execution_count" docs/features/async-agent-scheduler.mdx
grep -rn "30 seconds" docs/features/async-agent-scheduler.mdx

# Confirm await get_stats() is gone
grep -rn "await.*get_stats" docs/features/
```
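The post-edit grep checks above could also be run as a small Python helper, for example from a CI step. This is a sketch under stated assumptions: `find_stale_references` and `STALE_PATTERNS` are hypothetical names, and the patterns simply mirror the three grep expressions.

```python
import re

# Patterns mirror the grep checks: old field names, the wrong timeout,
# and any awaited get_stats() call.
STALE_PATTERNS = {
    "old field name": re.compile(r"success_count|failure_count|execution_count"),
    "wrong timeout": re.compile(r"30 seconds"),
    "awaited get_stats": re.compile(r"await.*get_stats"),
}


def find_stale_references(text: str) -> list:
    """Return (line_number, label) pairs for every stale reference in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in STALE_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


sample = "\n".join([
    "stats = await scheduler.get_stats()",
    "print(stats['success_count'])",
    "The stop() method waits up to 30 seconds.",
    "stats = scheduler.get_stats()",
])
stale = find_stale_references(sample)
```

An empty result for a page means all three checks pass for that page.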

Generated from PR #1566 review by Claude per `AGENTS.md` documentation-creation cycle (read SDK → understand → document).


docs: update AsyncAgentScheduler page for exception-handling fixes + correct existing API drift (PraisonAI PR #1566) #269
