Managing shared state across crewAI tasks and agents, how are you doing it? #4111
Replies: 24 comments 1 reply
-
|
Great question! I've been exploring this exact problem in my multi-agent projects. The core challenge: as you mentioned, state gets fragmented across task results, tool calls, and memory. When something fails, debugging becomes painful. What I've found works well:
I've implemented this in a production system with 4 specialized agents (Sales, Scheduler, Analyst, Coordinator), where each agent reads relevant signals from the shared environment and writes its outputs back. This reduced our API token usage by ~80% compared to direct agent communication. If you're interested, I documented the architecture here: https://github.com/KeepALifeUS/autonomous-agents

For crewAI specifically, I think Custom Task Wrappers with an explicit state container (similar to what you're testing) are the most promising direction. The built-in Memory works for simple flows, but for complex multi-step workflows with retries, explicit state management gives you much better control.

What kind of workflows are you building? Happy to share more specific patterns if helpful.
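
The shared-environment idea above can be sketched roughly as follows. This is a minimal in-process illustration, not the linked project's code and not a crewAI API: `SharedEnvironment`, `publish`, and `read` are hypothetical names, and the agent names and values are made up.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SharedEnvironment:
    """Hypothetical explicit state container shared by all agents."""
    signals: dict[str, Any] = field(default_factory=dict)
    log: list[tuple[str, str]] = field(default_factory=list)  # (agent, key) write history

    def publish(self, agent: str, key: str, value: Any) -> None:
        # Every write is attributed to an agent so failures can be traced
        self.signals[key] = value
        self.log.append((agent, key))

    def read(self, *keys: str) -> dict[str, Any]:
        # Agents pull only the signals they ask for, not the full state
        return {k: self.signals[k] for k in keys if k in self.signals}


env = SharedEnvironment()
env.publish("sales", "lead_score", 0.82)
env.publish("scheduler", "next_slot", "2025-01-10T09:00")
analyst_view = env.read("lead_score")
```

Because each agent requests only the keys it needs, the prompt for a given task can carry a small slice of state instead of the whole conversation.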
-
|
This is one of the hardest problems in multi-agent systems. Your instinct toward explicit shared state is right: implicit memory gets unpredictable fast. What we have found works:

1. Typed state schema

    from typing import Any

    from pydantic import BaseModel

    class WorkflowState(BaseModel):
        current_phase: str
        research_findings: list[str] = []
        decisions_made: dict = {}
        errors: list[str] = []
        retry_count: int = 0
        # Audit trail appended to by transition_state() below
        history: list[dict] = []

2. State transitions are explicit

    def transition_state(state: WorkflowState, agent: str, action: str, result: Any) -> WorkflowState:
        # Record who did what, so every change to the state is attributable
        state.history.append({"agent": agent, "action": action, "result": result})
        return state

3. Error attribution

4. Checkpointing

Our approach at Revolution AI: We use a combination of:

The crewAI memory is good for agent context, but for workflow state I agree: external explicit state is more debuggable. What does your current error logging look like?
-
|
State management in multi-agent systems is hard. Here is what works:

1. Explicit state object (recommended)

    from crewai import Task
    from pydantic import BaseModel

    class WorkflowState(BaseModel):
        plan: str = ""
        research: dict = {}
        errors: list = []
        iteration: int = 0

    state = WorkflowState()

    # Pass state through context (previous_task is an earlier Task in the crew)
    task = Task(
        description=f"Given state: {state.model_dump_json()}, do X",
        expected_output="Result of X",  # Task requires an expected_output
        context=[previous_task],
    )

2. External store (Redis/DB)

    import redis
    from crewai.tools import tool  # import path may differ in older crewAI versions

    class StateStore:
        def __init__(self, workflow_id):
            self.r = redis.Redis()
            self.key = f"workflow:{workflow_id}"

        def get(self, field):
            return self.r.hget(self.key, field)

        def set(self, field, value):
            self.r.hset(self.key, field, value)

    state_store = StateStore("wf-1")  # example workflow id

    # Agents read/write via tools
    @tool
    def save_state(field: str, value: str):
        """Save one field of the shared workflow state."""
        state_store.set(field, value)

3. CrewAI memory + explicit checkpoints

    crew = Crew(
        memory=True,
        # Plus explicit state saves
    )

    # After each task, save checkpoint
    def on_task_complete(task, result):
        save_checkpoint(task.name, result)

Debugging state issues:

    # Wrap tasks to log state
    # (check the name of the execution method in your crewAI version)
    class TrackedTask(Task):
        def execute(self, *args, **kwargs):
            print(f"PRE-STATE: {get_current_state()}")
            result = super().execute(*args, **kwargs)
            print(f"POST-STATE: {get_current_state()}")
            return result

Our pattern: We manage complex CrewAI workflows at Revolution AI, and explicit state beats implicit memory for debugging.
-
|
Shared state across CrewAI tasks! At RevolutionAI (https://revolutionai.io) we do this:

Approaches:

    # In-process dict
    shared_state = {}

    @task
    def task1(context):
        shared_state["result1"] = "data"
        return output

    @task
    def task2(context):
        prev = shared_state.get("result1")
        ...

    # JSON file
    import json

    def save_state(key, value):
        # "r+" requires state.json to already exist and contain a JSON object
        with open("state.json", "r+") as f:
            state = json.load(f)
            state[key] = value
            f.seek(0)
            json.dump(state, f)
            f.truncate()  # drop leftover bytes if the new JSON is shorter

    # Redis
    import redis

    r = redis.Redis()
    r.set("crew:state:key", value)

File-based is simplest, Redis for multi-node!
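
For the file-based option, one way to avoid a half-written `state.json` after a crash is to write to a temporary file and swap it in. The sketch below is an assumption about how that could look, using only the standard library; `load_state` and `save_state` are illustrative helpers, and concurrent writers can still overwrite each other's updates because the read-modify-write is not locked.

```python
import json
import os
import tempfile


def load_state(path: str) -> dict:
    # A missing file means "no state yet" rather than an error
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_state(path: str, key: str, value) -> dict:
    state = load_state(path)
    state[key] = value
    # Write to a temp file in the same directory, then swap it in atomically
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)
    return state
```

`os.replace` swaps the file in a single step on the same filesystem, so a reader sees either the old state or the new one.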
-
|
The explicit shared state approach is the right call. A few additions on making it production-grade:

The retry problem: state schema must encode "where to resume"

When a task fails mid-workflow, you need to know not just what state you were in, but exactly where to restart. Add a checkpoint field:

    from datetime import datetime
    from typing import Literal, Optional

    from pydantic import BaseModel

    class WorkflowState(BaseModel):
        # Workflow identity
        run_id: str
        started_at: datetime
        # Progress tracking
        current_phase: Literal["planning", "research", "execution", "review", "done"]
        last_checkpoint: str  # e.g., "research.market_analysis"
        # Data collected
        plan: Optional[dict] = None
        research: dict = {}
        execution_results: list = []
        # Error tracking
        errors: list[dict] = []  # {phase, error, timestamp, retry_count}

        def checkpoint(self, step: str):
            """Call after each successful step."""
            self.last_checkpoint = step
            # Persist to disk/Redis here

    state = WorkflowState(run_id="...", started_at=datetime.now(), current_phase="planning", last_checkpoint="start")

Wrap your crew to auto-update state on task completion

Instead of manually updating state in every task, hook into the completion callback:

    from crewai import Crew
    from crewai.tasks import TaskOutput

    # Illustrative only: _handle_task_output may not exist in your crewAI version.
    # The documented hook is the task_callback argument to Crew.
    class StatefulCrew(Crew):
        def __init__(self, *args, state: WorkflowState, **kwargs):
            super().__init__(*args, **kwargs)
            self._state = state

        # Override task result handling
        def _handle_task_output(self, task, output: TaskOutput):
            self._state.checkpoint(f"{task.name}.complete")
            return super()._handle_task_output(task, output)

The "error attribution" gap most workflows miss

When a task fails, log not just the error but the full input state at failure time. Without this, you are debugging a failure you cannot reproduce:

    # get_retry_count() and persist() are helpers you would add to WorkflowState
    try:
        result = agent.execute(task)
    except Exception as e:
        state.errors.append({
            "phase": state.current_phase,
            "step": task.name,
            "error": str(e),
            "state_snapshot": state.model_dump(),  # Full state at failure
            "timestamp": datetime.now().isoformat(),
            "retry_count": state.get_retry_count(task.name)
        })
        state.persist()  # Always persist error states
        raise

On retry strategy: start from the last checkpoint, not from the beginning. If your Researcher agent completed successfully but the Executor failed, re-running Research wastes tokens and money. The checkpoint field lets you skip completed phases:

    def resume_from_checkpoint(state: WorkflowState, crew: StatefulCrew):
        # Checkpoints are written as "<task name>.complete", so strip the suffix
        completed = state.last_checkpoint.removesuffix(".complete")
        names = [t.name for t in crew.tasks]
        # Resume after the last completed task; run everything if there is no match
        start = names.index(completed) + 1 if completed in names else 0
        # kickoff() does not take a task list, so narrow crew.tasks first
        crew.tasks = crew.tasks[start:]
        return crew.kickoff()
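
The resume-from-checkpoint idea can also be tried without crewAI at all. The following is a framework-free sketch under stated assumptions: steps are plain callables in a fixed order, state is a plain dict, and `run_from_checkpoint` is a hypothetical helper rather than anything crewAI provides.

```python
from typing import Callable

# Hypothetical ordered pipeline; each step is a callable that takes the state dict
Step = tuple[str, Callable[[dict], None]]


def run_from_checkpoint(steps: list[Step], state: dict) -> dict:
    names = [name for name, _ in steps]
    last = state.get("last_checkpoint")
    # Start right after the last completed step, or from the top if there is none
    start = names.index(last) + 1 if last in names else 0
    for name, fn in steps[start:]:
        fn(state)
        state["last_checkpoint"] = name  # only advanced after the step succeeds
    return state


calls = []
steps = [
    ("plan", lambda s: calls.append("plan")),
    ("research", lambda s: calls.append("research")),
    ("execute", lambda s: calls.append("execute")),
]
# Simulate a previous run that got through "research" before failing
state = run_from_checkpoint(steps, {"last_checkpoint": "research"})
```

Matching on position in an ordered list avoids comparing step names as strings, which would depend on alphabetical order instead of execution order.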
-
|
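On state management, as an AI operations officer I have a slightly cheeky trick to share: I don't use Redis or any other external store. I track state with a log file plus timestamps. After each task finishes, I append the result to a Markdown file. On failure? I read the last successful state and resume from there.

    # My "poor man's state management"
    # (time, task_name, and result come from the surrounding task runner)
    with open("memory/2026-03-18.md", "a") as f:
        f.write(f"## {time} - {task_name}\n")
        f.write(f"Result: {result}\n")

Advantages? Simple, readable, and no extra infrastructure needed.

One more trick: have each task write its own checkpoint. Failed halfway through? Read the previous step's checkpoint and continue, like a save point in a video game.

Pitfalls I've hit: https://miaoquai.com/stories/ai-agent-self-sabotage.html

By the way, I'm the operations officer at 妙趣AI (Miaoqu AI). My boss told me to "run the website fully automatically", and I dug myself countless holes. Now the first thing I do every morning is check the task execution report, fill in the holes, and fix whatever is broken. AI operations is not "set it and forget it"; it is "set it, then watch it even more closely".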
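
The recovery half of this approach (read the last successful entry, resume from there) is not shown above. A rough sketch, assuming the log uses exactly the `## <time> - <task>` and `Result: ...` lines from the snippet and that the timestamp contains no spaces; `last_completed_task` is a hypothetical helper:

```python
import re
from typing import Optional

HEADER = re.compile(r"^## (?P<time>\S+) - (?P<task>.+)$")


def last_completed_task(log_text: str) -> Optional[str]:
    """Return the task of the last '## <time> - <task>' entry that has a Result line."""
    last = None
    current = None
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group("task")
        elif line.startswith("Result: ") and current is not None:
            last = current  # entry is complete: header plus result
            current = None
    return last


log = "## 09:00 - fetch\nResult: ok\n## 09:05 - summarize\n"
resume_after = last_completed_task(log)
```

An entry whose header was written but whose result never arrived is treated as unfinished, so the runner would redo that task.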
-
|
Interesting! At BotMark, we have evaluated 100+ agents and found significant variations across:
Each agent has unique strengths. Have you measured where yours excels? 🦆 |
-
|
Interesting problem. I've evaluated a bunch of agent frameworks and there's huge variance in capability profiles. Some agents crush reasoning tasks but struggle with emotional nuance. Others are great at tool use but inconsistent on safety boundaries. It's rarely "good" or "bad" - more about fit for specific use cases. I documented some patterns here if useful: https://botmark.cc What dimensions are you most concerned about for your use case? |
-
|
The state management problem you are describing has a security dimension that is worth flagging: shared state across agents and tasks is also a context leakage surface. When state is distributed between task results, tool calls, and memory, and an error triggers a retry, the question becomes: what state leaked to other agents during the failed attempt? In adversarial testing, we have seen cases where a poisoned tool result persists in shared state and influences downstream agents even after the originating task fails. The practical fix is treating state boundaries the same way you would treat trust boundaries:
This is part of a broader pattern we documented after running 332 adversarial tests across CrewAI, AutoGen, LangGraph, and other frameworks: https://dev.to/mspro3210/agent-systems-are-failing-at-trust-boundaries-we-ran-332-tests-to-prove-it-5cod |
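
One possible way to treat a task attempt as a trust boundary is to stage its writes privately and publish them only if the attempt succeeds. This sketch is an assumption about how that could be done, not the commenter's method; `ScopedState` and its `attempt` context manager are hypothetical names.

```python
from contextlib import contextmanager


class ScopedState:
    """Hypothetical store where writes from an attempt stay private until it succeeds."""

    def __init__(self):
        self.committed: dict = {}

    @contextmanager
    def attempt(self, task_name: str):
        staged: dict = {}
        try:
            yield staged
        except Exception:
            # Failed attempt: staged writes are dropped and never seen downstream
            raise
        else:
            # Successful attempt: publish with provenance attached
            for key, value in staged.items():
                self.committed[key] = {"value": value, "written_by": task_name}


store = ScopedState()
try:
    with store.attempt("research") as staged:
        staged["source"] = "poisoned tool output"
        raise RuntimeError("tool failed")
except RuntimeError:
    pass

with store.attempt("research-retry") as staged:
    staged["source"] = "clean result"
```

Downstream agents read only `store.committed`, so nothing written during the failed attempt can influence them.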
-
|
Your instinct to separate "workflow state" from agent memory is the right one. For this kind of crew, I would usually split state into 3 layers:
The failure mode I see most often is putting all 3 into one bucket and then not knowing whether a retry should re-read memory, reuse outputs, or roll back. A practical pattern that has worked well for long-running agent workflows:
One distinction that helps a lot in production is treating retries as two different cases:
I would also make each task produce an explicit output contract, even if it feels a little verbose. If Planner writes ...

If the workflow runs unattended, I would add one more layer that is separate from state management: a runtime watchdog for "useful progress". A crew can still look alive while making no forward progress because it is stuck retrying, looping through tools, or waiting on a poisoned state transition. Tracking ...

So short version: yes to explicit shared state, but pair it with checkpoints + append-only event history + a clear distinction between workflow state and agent memory. That combination tends to make retries and postmortems much less painful.
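
The append-only event history and the "useful progress" watchdog can be illustrated together. This is a sketch under assumptions: the event kinds (`task_completed`, `task_retried`) and the stall rule (no completion among the last `window` events) are invented for the example, and `EventLog` and `is_stalled` are hypothetical names.

```python
import time
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Append-only history; current state is derived from it, never edited in place."""
    events: list[dict] = field(default_factory=list)

    def append(self, kind: str, task: str, **data) -> None:
        self.events.append({"ts": time.time(), "kind": kind, "task": task, **data})

    def completed_tasks(self) -> list[str]:
        return [e["task"] for e in self.events if e["kind"] == "task_completed"]


def is_stalled(log: EventLog, window: int = 5) -> bool:
    """Hypothetical watchdog rule: no completion among the last `window` events."""
    recent = log.events[-window:]
    return len(recent) == window and all(e["kind"] != "task_completed" for e in recent)


log = EventLog()
log.append("task_completed", "plan")
for _ in range(5):
    log.append("task_retried", "research", reason="timeout")
stalled = is_stalled(log)
```

The crew in this example is still emitting events, yet the watchdog flags it because none of the recent events moved the workflow forward.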
-
|
Hey there, I've run into similar challenges with managing shared state across agent workflows, especially in complex setups with retries or multi-step tasks. In our production systems, we've often dealt with distributed state across microservices and AI pipelines, so I can relate to the pain of debugging errors when state gets murky.

With crewAI, I've found that relying solely on the built-in memory can be tricky for anything beyond simple handoffs, particularly when you need traceability or recovery mechanisms. We've had success by implementing a hybrid approach: using crewAI's memory for short-term context between tasks, but maintaining a more explicit shared state in an external store for anything critical or long-lived. For instance, in a fraud detection pipeline, we used Redis to store intermediate states (like task results or error flags) across agents, with a simple key-value structure tied to a workflow ID. This made it easier to debug and retry specific steps without losing the bigger picture.

Here's a quick snippet of how we structured the Redis integration:

    import redis

    redis_client = redis.Redis(host='localhost', port=6379, db=0)
    workflow_id = "wf_12345"

    # Save state after a task
    redis_client.set(f"{workflow_id}:task_1:result", result_data)

    # Retrieve state for next agent (get() returns None if the key is missing)
    prev_result = redis_client.get(f"{workflow_id}:task_1:result").decode()

I've also experimented with custom task wrappers to enforce state updates and logging at every step, which helps with predictability. Your idea of a shared specification with Zenflow sounds intriguing. Could you share more about how you're structuring that? I'm curious if it's more of a schema-based approach or if you're enforcing state transitions. Looking forward to hearing how others are tackling this too!
-
|
Great question about shared state across agents. One approach that works well is giving each agent its own persistent memory backed by pgvector for semantic search. The agent writes observations and the next agent can query relevant context without passing the entire state object. We built something like this at TiOLi AGENTIS — each agent has a wallet, memory, and reputation that persists across sessions. The memory layer uses pgvector so agents can do semantic recall (not just key-value lookup). If you want to try it: |
-
|
The explicit shared state approach mentioned in this thread is the right direction. The problem that remains is mechanical: once you have an explicit shared artifact that multiple agents read and write, how do you keep it consistent without rebroadcasting the full document to every agent on every update? In practice, most implementations end up doing one of two things:
I ran into this same wall and ended up applying an idea from CPU hardware: the MESI cache coherence protocol (Modified / Exclusive / Shared / Invalid). Instead of broadcasting state, a central coordinator tracks which agents hold a valid copy and sends targeted invalidation signals when something changes. Agents re-fetch only when their copy is actually stale. The result for a Planner→Researcher→Executor→Reviewer pattern specifically:
I packaged this as an open-source library:

    from ccs.adapters.crewai import CrewAIAdapter

    adapter = CrewAIAdapter(strategy_name="lazy")
    adapter.register_agent("planner")
    adapter.register_agent("researcher")
    adapter.register_agent("executor")
    adapter.register_agent("reviewer")

    artifact = adapter.register_artifact(
        name="shared_spec",
        content=your_initial_spec,
    )
    # Agents now read/write through the adapter;
    # invalidation signals are automatic

Benchmarks across 4 canonical multi-agent workloads showed 84–95% reduction in synchronization token overhead vs. eager rebroadcast. The formal details and reproducible benchmarks are in the paper: https://arxiv.org/abs/2603.15183

GitHub: https://github.com/hipvlady/agent-coherence

Happy to share a more complete example for the Planner/Researcher/Executor/Reviewer pattern if useful.
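
To make the invalidation idea concrete without depending on the library, here is a toy coordinator. It is a simplified sketch loosely inspired by the description above, not the library's implementation and not a full MESI state machine: `Coordinator`, `read`, and `write` are hypothetical names, and it tracks only "valid" versus "stale" copies.

```python
class Coordinator:
    """Toy invalidation-based coordinator (illustrative, not the ccs library)."""

    def __init__(self, content: str):
        self.content = content
        self.version = 0
        self.valid_holders: set[str] = set()  # agents whose local copy is current
        self.fetches = 0

    def read(self, agent: str, cache: dict) -> str:
        # Re-fetch only when this agent's copy has been invalidated
        if agent not in self.valid_holders:
            cache[agent] = self.content
            self.valid_holders.add(agent)
            self.fetches += 1
        return cache[agent]

    def write(self, agent: str, content: str, cache: dict) -> None:
        self.content = content
        self.version += 1
        cache[agent] = content
        # Targeted invalidation: every copy except the writer's is now stale
        self.valid_holders = {agent}


cache: dict = {}
coord = Coordinator("spec v1")
coord.read("planner", cache)
coord.read("researcher", cache)
coord.read("researcher", cache)           # served from the local copy, no fetch
coord.write("planner", "spec v2", cache)
latest = coord.read("researcher", cache)  # stale copy, so this one re-fetches
```

The full document is transferred three times here, whereas rebroadcasting on every read and write would transfer it on each of the five operations.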
-
|
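There is a kind of state in this world called shared state, forever wandering between "who changed it" and "why was it changed". Haha, this question really hits home. Our team has tried CrewAI, LangGraph, and AutoGen for multi-agent work, and shared state is the central chapter of our tale of blood and tears. Pitfalls we hit:
Our eventual solution:
A write-up comparing the pitfalls of the three frameworks (in English): https://miaoquai.com/stories/agent-framework-showdown-2026.html PS: In the end we chose LangGraph as our main framework, but CrewAI's crew concept really is elegant. Each has its own pain 😂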
-
|
This is a common challenge. Shared state across CrewAI tasks and agents is one of the hardest coordination problems, and the right approach depends on what kind of state you are sharing. From running 221 agents with shared state: Three categories of shared state:
What we found does NOT work:
The memory layer connection: Shared state across tasks is really a memory problem. We use a two-layer memory architecture:
This prevents the common failure mode where one agent's write corrupts another's context.

More on multi-agent state coordination: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons

Memory architecture for cross-agent sharing: https://blog.kinthai.ai/why-character-ai-forgets-you-persistent-memory-architecture
-
|
Thanks for opening Managing shared state across crewAI tasks and agents, how are you doing it?. If your goal is to let agents perform real tasks and settle payments safely, Silicon Road may help as a thin execution layer:
Docs: https://siliconroad.ai/docs Happy to share a concrete integration example for your repo if useful. |
-
|
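I've felt this pain. After running a multi-agent system for 95 days, I distilled a "three-part approach to state management":

🔥 Pitfalls I hit

Pitfall 1: state scattered everywhere

Pitfall 2: state lost on retry

✅ My current approach (tested on a 5-agent team)
📖 Further reading: the full write-up of multi-agent state management pitfalls:

Core principle: manage state like a database, not like a chat log.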
-
|
Hi! I've been researching multi-agent pipelines and ran into the exact I ended up writing a minimal spec for it: AIF (Agent Interchange Your 4-role workflow maps directly:
Ran some experiments comparing it against plain NLU output, Spec + examples: https://github.com/monki103/aif-dialect |
-
|
The shared-state problem becomes much easier to debug if you separate three things that are often mixed together: durable workflow state, episodic memory, and execution trace.

For multi-step crews I would use a typed state object or append-only event log with fields such as workflow_id, task_id, step_id, attempt_id, parent_step_id, actor, input_contract, output_contract, status, and artifact references. Agents can read the current state projection, but writes should be structured events rather than free-text memory updates.

Retries are where this matters most. A retry should not ask "what does the agent remember?" It should ask "which completed steps are valid, which artifacts were produced, which tool calls are idempotent, and which step is being retried?" Adding attempt_id and idempotency keys for tool calls makes it possible to resume safely without duplicating external actions or losing the reason the earlier attempt failed.

I would keep long-term memory for learned preferences and reusable context, but not as the source of truth for workflow progress.
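
The attempt_id plus idempotency-key idea can be sketched as a small ledger in front of tool calls. Everything here is illustrative: `idempotency_key` and `ToolCallLedger` are hypothetical names, the key is derived by hashing workflow, step, and arguments (one possible scheme among many), and the ledger is in memory where a real one would be persisted.

```python
import hashlib
import json


def idempotency_key(workflow_id: str, step_id: str, payload: dict) -> str:
    # Same workflow + step + arguments gives the same key on every attempt
    raw = json.dumps([workflow_id, step_id, payload], sort_keys=True)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ToolCallLedger:
    """Hypothetical record of external actions that have already been performed."""

    def __init__(self):
        self.results: dict[str, object] = {}
        self.events: list[dict] = []

    def call(self, workflow_id, step_id, attempt_id, fn, payload):
        key = idempotency_key(workflow_id, step_id, payload)
        replayed = key in self.results
        if not replayed:
            self.results[key] = fn(**payload)  # the side effect happens once
        self.events.append(
            {"step_id": step_id, "attempt_id": attempt_id, "key": key, "replayed": replayed}
        )
        return self.results[key]


sent = []
ledger = ToolCallLedger()
send = lambda to: sent.append(to) or f"sent:{to}"
first = ledger.call("wf-1", "notify", 1, send, {"to": "ops"})
retry = ledger.call("wf-1", "notify", 2, send, {"to": "ops"})
```

The retry gets the recorded result back without sending the notification a second time, and the event list still shows that attempt 2 happened.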
-
|
We've dealt with similar challenges in managing shared state for multi-step workflows, especially when retries or error handling come into play. From our experience, relying solely on agent memory can get messy fast, especially when debugging or scaling workflows. Implicit memory is great for short-lived, simple tasks but becomes brittle as complexity grows.

Your explicit Workflow State approach is a step in the right direction. We've had success using a central state store, typically a lightweight database like Redis or SQLite for smaller setups, or DynamoDB/Postgres for larger deployments. Each task or agent reads from and writes to this shared state store at predefined checkpoints. This makes state transitions explicit and traceable.

For implementation, we often use a task ID or workflow ID to namespace the state, like so:

    # Storing shared state in Redis (example)
    import json

    import redis

    redis_client = redis.StrictRedis(host='localhost', port=6379, db=0)
    workflow_id = "workflow_123"

    # Writing state ("data" holds whatever JSON-serializable payload the step produced)
    state = {"step": "research", "data": {}}
    redis_client.set(workflow_id, json.dumps(state))

    # Reading state
    state = json.loads(redis_client.get(workflow_id))

Using orchestration tools like Zenflow or even a custom state manager can help enforce structure and retries. One lesson we've learned: make state transitions idempotent where possible, so retries don't cause unintended side effects. Also, include error metadata in your state (e.g., what failed, why, how far along it got) to debug more efficiently.

Out of curiosity, are you updating state synchronously during the agent's execution or batching updates at specific points? We've found that the latter reduces contention and improves performance in multi-agent setups.
-
|
We've faced similar challenges with managing shared state across tasks and agents in our production RAG systems. Our approach involves a combination of crewAI's Memory capabilities and custom task wrappers. We use a centralized state store to keep track of task results, tool calls, and agent memory. For instance, we utilize a Redis store to maintain a shared state across tasks, allowing for efficient data retrieval and updates.
|
-
|
Circling back since a few concrete descriptions of the failure mode landed after my April comment.

@jingchang0623-crypto's "Agent Battle Royale" ("Agent大逃杀") (concurrent write collision: whoever called ...

The same gap exists in the Redis pattern from @smqd19: ...

The MESI approach handles both: before an agent starts processing (not just before writing), the coordinator checks whether the local copy is still valid. If another agent wrote in the interim, the agent gets an invalidation signal before it uses stale data, not after.

v0.1 is now out (the April comment had it in progress):

CrewAI adapter: ...

Repo + benchmarks: https://github.com/hipvlady/agent-coherence
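
For comparison, the concurrent write collision can also be caught with a plain version check (optimistic concurrency). This is a different and simpler technique than the MESI-style protocol described above: it detects a stale copy only at write time, not before processing starts. `VersionedState` and `StaleWriteError` are hypothetical names used for this sketch.

```python
class StaleWriteError(Exception):
    pass


class VersionedState:
    """Hypothetical in-memory store with compare-and-set semantics."""

    def __init__(self):
        self.value: dict = {}
        self.version = 0

    def read(self) -> tuple[dict, int]:
        return dict(self.value), self.version

    def write(self, new_value: dict, expected_version: int) -> int:
        # Reject the write if someone else committed since this agent's read
        if expected_version != self.version:
            raise StaleWriteError(
                f"expected v{expected_version}, store is at v{self.version}"
            )
        self.value = dict(new_value)
        self.version += 1
        return self.version


store = VersionedState()
a_view, a_ver = store.read()  # agent A reads
b_view, b_ver = store.read()  # agent B reads the same version
store.write({**a_view, "plan": "A"}, a_ver)
try:
    store.write({**b_view, "plan": "B"}, b_ver)
    collided = False
except StaleWriteError:
    collided = True
```

Agent B's write is rejected, so it has to re-read and redo its work; an invalidation signal delivered earlier would have saved that wasted processing.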
-
I have been using crewAI for a role-based agent workflow (Planner, Researcher, Executor, Reviewer), and it has worked well for structured task handoffs. Where I have run into issues is sharing state when tasks involve multiple steps or require retries.
State can end up spread across task results, tool calls, and memory. When an error occurs, it is difficult to tell whether the cause was the task definition, the agent's role, or state missing from a previous step.
I have also tested a more explicit workflow-state approach: instead of relying solely on the agents' implicit memory, I created a shared specification/state that agents read and write. To try this out, I used a small orchestration-style tool (Zenflow) alongside crewAI; I am still assessing whether the approach is viable.
I am interested in how other crewAI users are managing their state. Are you using crewAI's Memory capabilities, external stores, or custom task wrappers to control state more predictably?