docs: document concurrency safety behaviour from PraisonAI PR #1459 #182

@MervinPraison

Context

PraisonAI PR #1459 (merged 2026-04-18, fixes #1458) landed three concurrency-safety fixes in the praisonaiagents Core SDK. The fixes are mostly internal, but one is an observable behaviour change that end-users can hit, and two give users new guarantees worth documenting. docs/features/ currently has no page covering agent-level concurrency limits, tool-execution timeouts, or plugin thread-safety.

Another agent will create/update the MDX file(s) based on this issue.


Summary of SDK changes (ground truth for the docs)

All three changes are in praisonaiagents/ (repo-root in PraisonAIDocs, synced daily from PraisonAI).

1. ConcurrencyRegistry.acquire_sync() — observable behaviour change ⚠️

File: praisonaiagents/agent/concurrency.py

Before: when called while an event loop was running, it silently decremented the private asyncio.Semaphore._value attribute and logged a warning — leaving the registry in an inconsistent state.

After: it raises RuntimeError("acquire_sync('<agent_name>') cannot be called with a running event loop; use async acquire() in async contexts."). When no loop is running, it creates a fresh loop with asyncio.new_event_loop() and calls run_until_complete(sem.acquire()) on it.

User-visible rule:

  • Sync contexts → call registry.acquire_sync(agent_name) / registry.release(agent_name)
  • Async contexts → await registry.acquire(agent_name) / registry.release(agent_name)
  • Mixing them now fails fast instead of corrupting state.
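
The rule above can be illustrated with a self-contained sketch. SketchRegistry is a hypothetical stand-in written for this issue, not the SDK's ConcurrencyRegistry; only the method names and the error message are taken from the description above, and the internals (one asyncio.Semaphore per agent name, a default limit of 1) are assumptions.

```python
import asyncio


class SketchRegistry:
    """Illustrative stand-in; NOT the SDK's ConcurrencyRegistry."""

    def __init__(self):
        self._limits = {}
        self._sems = {}

    def set_limit(self, name, limit):
        self._limits[name] = limit

    def _sem(self, name):
        # Assumption: one semaphore per agent name, default limit of 1.
        if name not in self._sems:
            self._sems[name] = asyncio.Semaphore(self._limits.get(name, 1))
        return self._sems[name]

    async def acquire(self, name):
        await self._sem(name).acquire()

    def acquire_sync(self, name):
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            pass  # no running loop: safe to block on a private loop
        else:
            # Fail fast instead of touching semaphore internals.
            raise RuntimeError(
                f"acquire_sync({name!r}) cannot be called with a running "
                "event loop; use async acquire() in async contexts."
            )
        loop = asyncio.new_event_loop()
        try:
            loop.run_until_complete(self._sem(name).acquire())
        finally:
            loop.close()

    def release(self, name):
        self._sem(name).release()


async def misuse(registry):
    """Call the sync API from async code and return the error text."""
    try:
        registry.acquire_sync("researcher")
    except RuntimeError as exc:
        return str(exc)
    return None
```

Calling registry.acquire_sync("researcher") from plain synchronous code succeeds, while asyncio.run(misuse(registry)) returns the error text, which is the fail-fast behaviour the docs page should describe.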

2. Reusable tool executor — resource-leak fix

File: praisonaiagents/agent/tool_execution.py

Before: created a brand-new ThreadPoolExecutor(max_workers=1) per tool call, with ad-hoc shutdown in every branch → resource leak under load.

After: a single per-agent executor is lazily created and reused:

if not hasattr(self, '_tool_executor'):
    self._tool_executor = concurrent.futures.ThreadPoolExecutor(
        max_workers=2, thread_name_prefix=f"tool-{self.name}"
    )
future = self._tool_executor.submit(ctx.run, execute_with_context)
try:
    result = future.result(timeout=tool_timeout)
except concurrent.futures.TimeoutError:
    future.cancel()
    logging.warning(f"Tool {function_name} timed out after {tool_timeout}s")
    result = {"error": f"Tool timed out after {tool_timeout}s", "timeout": True}

User-visible facts worth documenting:

  • Each Agent instance has its own 2-thread tool executor, reused across calls.
  • Timed-out tools return {"error": "Tool timed out after Ns", "timeout": True} rather than raising.
  • Thread names are prefixed tool-<agent_name> — useful for tracing / debugging.
  • tool_timeout (Agent-level param) is what gates this path.
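
A minimal standalone sketch of the same pattern may help the doc writer explain it. ToolRunner, run_tool, slow_tool and fast_tool are hypothetical names invented for illustration; the executor settings and the timeout dict mirror the excerpt above, and the context propagation via ctx.run is left out for brevity.

```python
import concurrent.futures
import time


class ToolRunner:
    """Hypothetical stand-in for the agent's tool path; not SDK code."""

    def __init__(self, name, tool_timeout=None):
        self.name = name
        self.tool_timeout = tool_timeout

    def run_tool(self, func, *args):
        # Lazily create one executor per instance and reuse it afterwards.
        if not hasattr(self, "_tool_executor"):
            self._tool_executor = concurrent.futures.ThreadPoolExecutor(
                max_workers=2, thread_name_prefix=f"tool-{self.name}"
            )
        future = self._tool_executor.submit(func, *args)
        try:
            return future.result(timeout=self.tool_timeout)
        except concurrent.futures.TimeoutError:
            # cancel() only helps if the call has not started running yet.
            future.cancel()
            return {
                "error": f"Tool timed out after {self.tool_timeout}s",
                "timeout": True,
            }


def slow_tool():
    time.sleep(0.5)
    return "done"


def fast_tool(x):
    return x * 2
```

With tool_timeout=0.1, run_tool(slow_tool) returns the timeout dict instead of raising, and a later run_tool(fast_tool, 21) reuses the same executor. One caveat worth stating in the docs: Future.cancel() cannot interrupt a call that is already running, so a timed-out tool may keep occupying one of the two worker threads until it finishes.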

3. Thread-safe plugin enable/disable

File: praisonaiagents/plugins/__init__.py

Added _plugins_lock = threading.Lock(), which now guards every read/write of the module-level _plugins_enabled and _enabled_plugin_names globals. enable() also snapshots _enabled_plugin_names under the lock before iterating, removing a TOCTOU race.

User-visible fact: plugins.enable(...), plugins.disable(...) and plugins.is_enabled(...) are now safe to call from multiple threads (e.g. FastAPI workers, background jobs, Celery tasks).
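
The locking pattern described here can be sketched as follows. This is an illustration only: the three global names come from the description above, but the function bodies, the use of None to mean "all plugins", and the return value of enable() are assumptions rather than the SDK's actual code.

```python
import threading

_plugins_lock = threading.Lock()
_plugins_enabled = False
_enabled_plugin_names = None  # assumption: None means "all plugins"


def enable(names=None):
    global _plugins_enabled, _enabled_plugin_names
    with _plugins_lock:
        _plugins_enabled = True
        _enabled_plugin_names = None if names is None else set(names)
        # Snapshot under the lock so later iteration sees a stable copy.
        snapshot = None if names is None else set(_enabled_plugin_names)
    # Work on the snapshot outside the lock (e.g. loading each plugin).
    return sorted(snapshot) if snapshot is not None else None


def disable():
    global _plugins_enabled, _enabled_plugin_names
    with _plugins_lock:
        _plugins_enabled = False
        _enabled_plugin_names = None


def is_enabled(name=None):
    with _plugins_lock:
        if not _plugins_enabled:
            return False
        if name is None or _enabled_plugin_names is None:
            return True
        return name in _enabled_plugin_names
```

Because every read and write goes through the same lock, a thread calling is_enabled("logging") never observes the flag and the name set in a half-updated state, even while other threads call enable() and disable() concurrently.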


Docs gap analysis

| Area | Current doc | Gap |
| --- | --- | --- |
| Agent concurrency limits (ConcurrencyRegistry, per-agent semaphores) | None | No user-facing page exists. acquire_sync vs acquire rule is undocumented. |
| Tool execution timeout (tool_timeout, timeout result shape, executor reuse) | docs/configuration/tool-config.mdx mentions tool_timeout only in passing | Needs a short "how timeouts actually behave" section. |
| Plugin thread-safety (plugins.enable / plugins.disable / plugins.is_enabled) | docs/features/plugins.mdx exists (full feature page) | No thread-safety note anywhere. |

No existing page needs a full rewrite — this is additive.


Proposed documentation work

A. New page: docs/features/concurrency.mdx (primary ask)

Agent-centric page covering both (a) per-agent concurrency limits and (b) tool-execution timeout behaviour, since users reach both via the Agent class.

Placement rule (from PraisonAI AGENTS.md): MUST go under docs/features/, NEVER docs/concepts/. Add the entry to docs.json under the existing Features group.

Required structure (per AGENTS.md template — agent-centric, user-focused, beginner-friendly):

  1. Frontmatter

    ---
    title: "Concurrency"
    sidebarTitle: "Concurrency"
    description: "Limit parallel agent runs and bound tool execution time"
    icon: "gauge"
    ---
  2. One-sentence intro + hero Mermaid diagram using the standard palette (#8B0000, #189AB4, #10B981, #F59E0B, #6366F1, white text, #7C90A0 strokes). Show: User → Agent → ConcurrencyRegistry gate → Tool executor (with timeout) → Response.

  3. Quick Start with <Steps> — agent-first examples (keep imports simple: from praisonaiagents import Agent):

    • Step 1 — Limit parallel runs of an agent using ConcurrencyRegistry:
      from praisonaiagents import Agent
      from praisonaiagents.agent.concurrency import ConcurrencyRegistry
      
      registry = ConcurrencyRegistry()
      registry.set_limit("researcher", 2)  # at most 2 concurrent runs
      
      agent = Agent(name="researcher", instructions="Research topics")
      
      # Sync context
      registry.acquire_sync("researcher")
      try:
          agent.start("Research Mars exploration")
      finally:
          registry.release("researcher")
    • Step 2 — Same, async:
      await registry.acquire("researcher")
      try:
          await agent.astart("Research Mars exploration")
      finally:
          registry.release("researcher")
    • Step 3 — Bound tool time with tool_timeout:
      from praisonaiagents import Agent
      
      agent = Agent(
          name="Assistant",
          instructions="Use tools to help users",
          tools=["get_weather"],
          tool_timeout=30,  # seconds; slow tools return a timeout dict
      )
      agent.start("What's the weather in Tokyo?")
  4. How It Works: a sequence diagram showing user call → acquire_sync/acquire → agent run → tool submit to reusable ThreadPoolExecutor(max_workers=2, thread_name_prefix="tool-<name>") → either result or {"error": "...", "timeout": True} → release.

  5. Sync vs Async rule (this is the behaviour change from #1459):

    • Table: context → method → what happens if you mix them.
    • Callout with <Warning>: "Calling acquire_sync() from an async context raises RuntimeError. Use await acquire() instead."
    • Show the exact error message.
  6. Tool timeout behaviour section:

    • Return shape on timeout: {"error": "Tool timed out after Ns", "timeout": True} (not an exception).
    • One executor per Agent instance (lazy), max_workers=2, thread name tool-<agent_name> — useful for log filtering.
    • Simple "which option to pick" Mermaid diagram: blocking IO tool → raise tool_timeout; CPU-bound tool → same; no timeout → unset.
  7. Common Patterns

    • Limit an agent to N parallel runs inside a FastAPI route (acquire/release in try/finally).
    • Wrap concurrency acquire in an async with helper.
    • Choosing a tool_timeout value (network tools 30–60s, local tools 5–10s).
  8. Best Practices: <AccordionGroup> with 3–4 items:

    • Always release() in a finally block.
    • Don't mix sync and async acquire in the same code path.
    • Set tool_timeout whenever tools do network IO.
    • Use thread-name prefix tool-<agent> in logs to trace which agent timed out.
  9. Related: <CardGroup cols={2}> linking to plugins.mdx and docs/configuration/tool-config.mdx.
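
For the "async with helper" pattern listed under Common Patterns, one possible shape is sketched below. agent_slot, _StubRegistry and demo are hypothetical names; the stub only imitates the acquire/release interface described in this issue so that the sketch runs without the SDK, and a real page would pass a ConcurrencyRegistry instead.

```python
import asyncio
from contextlib import asynccontextmanager


class _StubRegistry:
    """Stand-in with the same acquire/release shape; not SDK code."""

    def __init__(self, limit):
        self._limit = limit
        self._sem = None

    async def acquire(self, name):
        if self._sem is None:
            # Created lazily so it is first used inside the running loop.
            self._sem = asyncio.Semaphore(self._limit)
        await self._sem.acquire()

    def release(self, name):
        self._sem.release()


@asynccontextmanager
async def agent_slot(registry, agent_name):
    """Hold one concurrency slot for the duration of the with-block."""
    await registry.acquire(agent_name)
    try:
        yield
    finally:
        registry.release(agent_name)  # always released, even on error


async def demo(limit=2, jobs=5):
    registry = _StubRegistry(limit)
    running = 0
    peak = 0

    async def job():
        nonlocal running, peak
        async with agent_slot(registry, "researcher"):
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for the agent run
            running -= 1

    await asyncio.gather(*(job() for _ in range(jobs)))
    return peak
```

Running asyncio.run(demo()) reports the peak number of jobs that held a slot at once, which stays at the configured limit. The helper keeps the release in a finally block, matching the first best-practice item above.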

B. Update docs/features/plugins.mdx

Add a short "Thread safety" subsection (5–10 lines) after the existing Performance block. Content:

plugins.enable(...), plugins.disable(...) and plugins.is_enabled(...) are protected by an internal lock, so they're safe to call from multiple threads — for example from a FastAPI worker pool or a Celery task. The lock also protects against time-of-check/time-of-use races during discovery.

Keep it to a single paragraph + one callout (<Note>). No new code example needed — the existing plugins.enable(["logging", "metrics"]) example is enough.

C. Update docs/configuration/tool-config.mdx (small)

Where tool_timeout is mentioned, add one line documenting the return shape on timeout ({"error": "...", "timeout": True}) and note that each agent has its own 2-thread executor that survives across calls. Keep it minimal; link to the new docs/features/concurrency.mdx page.

D. docs.json

Add concurrency under the existing Features group (alphabetical order, near callbacks). Do not add it to the Concepts group — that folder is human-approved only per PraisonAI AGENTS.md.


Ground-truth file list for the doc writer

  • praisonaiagents/agent/concurrency.py: ConcurrencyRegistry, set_limit, acquire, acquire_sync, release
  • praisonaiagents/agent/tool_execution.py: reusable _tool_executor, tool_timeout handling
  • praisonaiagents/plugins/__init__.py: _plugins_lock, enable, disable, is_enabled
  • praisonaiagents/agent/agent.py: Agent(tool_timeout=...) parameter

The documentation must reflect these files verbatim — read them before writing.


Acceptance criteria

  • New file docs/features/concurrency.mdx exists and matches the structure above
  • Hero Mermaid diagram present, standard colour palette, white text
  • All code examples run as-is with from praisonaiagents import Agent (and the one from praisonaiagents.agent.concurrency import ConcurrencyRegistry where needed)
  • Sync vs async rule documented with the exact RuntimeError message
  • Tool timeout return shape documented
  • docs/features/plugins.mdx has a short thread-safety note
  • docs/configuration/tool-config.mdx mentions the timeout return shape and links to the new page
  • docs.json updated under Features group only (not Concepts)
  • No files created or modified under docs/concepts/
  • No files created or modified under docs/js/ or docs/rust/ (auto-generated)
