feat: add governance policy framework for ungoverned call sites (OWASP Agentic Top 10)#5281
Conversation
…P Agentic Top 10) Addresses issue #5280 - Security audit identified 266 ungoverned call sites that could benefit from governance checks per OWASP Agentic Top 10 standards. Changes: - Add security/governance.py with SubprocessPolicy, HttpPolicy, ToolPolicy, GovernanceConfig classes supporting allowlist/blocklist and custom validators - Integrate governance into SecurityConfig for crew-level configuration - Add subprocess governance check in agent _validate_docker_installation - Add tool governance checks in execute_tool_and_check_finality (sync/async) - Add tool governance checks in crew_agent_executor native tool call path - Export governance types from security module - Add 42 comprehensive tests covering all policy types and integration points Governance is permissive by default (allows all) to maintain backward compatibility. Users can configure policies to restrict operations. Co-Authored-By: João <joao@crewai.com>
|
Prompt hidden (unlisted session) |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Co-Authored-By: João <joao@crewai.com>
|
|
||
| error = exc_info.value | ||
| assert error.category == "http" | ||
| assert "evil.com" in error.detail |
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 9 days ago
In general, to avoid incomplete URL substring sanitization issues, code should parse URLs and examine structured components (such as hostname) rather than using substring checks on the raw URL; tests should likewise assert on well-defined structured values or exact messages, instead of substring matches that look like domain checks.
In this case, the best fix without changing functionality is to replace the substring assertion with a more specific equality (or startswith) assertion against the exact message format we expect from GovernanceError. The goal of the test is simply to ensure the blocked domain appears in the error detail. Assuming HttpPolicy formats the message as something like "Domain 'evil.com' is blocked by policy", we can assert equality with that full string. This avoids generic substring searches that the analyzer flags, while still verifying that evil.com is present in the detail exactly as intended. Concretely, in lib/crewai/tests/security/test_governance.py, within TestHttpPolicy.test_governance_error_attributes, replace assert "evil.com" in error.detail with an assertion that error.detail equals (or starts with) the expected message string (for example, assert error.detail == "Domain 'evil.com' is blocked by policy"). No new imports or helpers are required.
| @@ -201,7 +201,7 @@ | ||
|
|
||
| error = exc_info.value | ||
| assert error.category == "http" | ||
| assert "evil.com" in error.detail | ||
| assert error.detail == "Domain 'evil.com' is blocked by policy" | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- |
|
|
||
| assert restored.governance is not None | ||
| assert "rm" in restored.governance.subprocess_policy.blocked_commands | ||
| assert "evil.com" in restored.governance.http_policy.blocked_domains |
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
Copilot Autofix
AI 9 days ago
Copilot could not generate an autofix suggestion
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 27d5790. Configure here.
| "shell=True is not permitted by the subprocess policy.", | ||
| ) | ||
|
|
||
| cmd_basename = command[0].rsplit("/", 1)[-1] |
There was a problem hiding this comment.
Subprocess command basename extraction fails on Windows paths
Medium Severity
The basename extraction command[0].rsplit("/", 1)[-1] only splits on forward slashes and won't correctly extract the command name from Windows-style paths (e.g., C:\Program Files\Docker\docker.exe). This means blocked_commands=["docker"] would fail to match on Windows, creating a blocklist bypass. Using os.path.basename would handle both Unix and Windows path separators correctly.
Reviewed by Cursor Bugbot for commit 27d5790. Configure here.
|
|
||
| if hook_blocked: | ||
| if governance_blocked: | ||
| result = result # already set above |
There was a problem hiding this comment.
Dead code no-op self-assignment in governance branch
Low Severity
The line result = result is a self-assignment no-op that does nothing. This dead code makes the control flow harder to understand. A pass statement or a bare comment would more clearly communicate the intent that the governance-blocked result was already set above.
Reviewed by Cursor Bugbot for commit 27d5790. Configure here.


Summary
Addresses #5280 — security audit identified ungoverned call sites (subprocess, HTTP, tool invocations) lacking approval gates per OWASP Agentic Top 10.
This PR introduces a governance policy framework that lets users define allowlist/blocklist rules and custom validators for three categories:
Policies are aggregated into
GovernanceConfig, embedded inSecurityConfig, and enforced at:Agent._validate_docker_installation()(subprocess governance)execute_tool_and_check_finality()/aexecute_tool_and_check_finality()(tool governance)CrewAgentExecutor._execute_single_native_tool_call()(native tool governance)Governance is permissive by default — no existing behavior changes unless the user explicitly configures policies.
42 new tests cover all policy types, serialization, SecurityConfig integration, and the agent/tool execution paths.
Review & Testing Checklist for Human
crew_agent_executor.pycontrol flow: The hook/governance blocking logic was restructured (governance check → hook check → execution). Confirm thegovernance_blocked/hook_blocked/max_usage_reachedbranching is correct and doesn't skip hooks or tool execution unexpectedly. Note theresult = resultno-op on the governance-blocked branch.HttpPolicyandvalidate_http()exist but no actual HTTP call site (httpx,requests) is wrapped with governance checks yet. Decide if this is acceptable as a foundation or if at least one integration point is needed._validate_docker_installationis governed. The audit identified ~69 subprocess sites. Verify this is an acceptable starting point.custom_validatorserialization: Custom validators are excluded from Pydantic serialization (exclude=True), so they won't surviveto_dict()/from_dict()round-trips. Confirm this is acceptable.uv run pytest tests -vv) to verify no regressions beyond the new governance tests.Notes
GovernanceErrorexception carries.categoryand.detailattributes for programmatic handling.ToolResultmessage (not an exception) to allow the agent to recover gracefully.hasattr(crew, "security_config")guards are used at integration points to handle cases where crew may be a mock or None.Link to Devin session: https://app.devin.ai/sessions/64ab5aab34d140128f9dd718388da42f
Note
Medium Risk
Touches core agent/tool execution paths by adding pre-execution governance checks, which could block previously-allowed subprocess/tool calls when configured. Default policies are permissive, but control-flow changes in native tool execution and new serialization fields warrant review.
Overview
Introduces a security governance policy framework (
SubprocessPolicy,HttpPolicy,ToolPolicy) aggregated underGovernanceConfigand embedded intoSecurityConfig(includingto_dict/from_dictsupport and public exports viacrewai.security).Enforces governance at key call sites:
Agent._validate_docker_installation()now validates thedocker infosubprocess before running, and tool execution now checksvalidate_tool()before running tools in both the ReAct tool path (execute_tool_and_check_finality/aexecute_tool_and_check_finality) and native tool-calling (CrewAgentExecutor._execute_single_native_tool_call, skipping hooks when governance blocks).Adds a comprehensive new test suite (
tests/security/test_governance.py) covering policy behavior, serialization/integration withSecurityConfig, and enforcement in the agent subprocess and tool execution paths.Reviewed by Cursor Bugbot for commit 27d5790. Bugbot is set up for automated code reviews on this repo. Configure here.