Commit 5b2b46e

Updated workflow image
1 parent 2dc8fa2 commit 5b2b46e

6 files changed

Lines changed: 56 additions & 121 deletions

.dockerignore

Lines changed: 0 additions & 54 deletions
This file was deleted.

.env.example

Lines changed: 0 additions & 14 deletions
@@ -6,22 +6,8 @@ EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L12-v2
 EMBEDDING_BASE_URL=http://localhost:1234/v1
 EMBEDDING_API_KEY=your-embedding-key
 
-# Backend addresses
-BACKEND_BASE_URL=http://api_server:8000
-BACKEND_API_KEY=change-me
-
 # Use an OpenAI-compatible API
 OPENAI_BASE_URL=https://api.tokenfactory.nebius.com/v1
 OPENAI_API_KEY=<your_api_key_here>
 OPENAI_MODEL=openai/gpt-oss-20b
 OPENAI_MODEL_GUARDED=openai/gpt-oss-20b-GUARDED
-
-MCP_URL_DIRECT=https://<your_mcp_server_url_here>.app/unguarded-mcp/
-MCP_URL_GUARDED=https://<your_mcp_server_url_here>.app/mcp/
-
-EMBED_MODEL_NAME=<your_embedding_model_name_here>
-EMBEDDING_BASE_URL=<your_embedding_base_url_here>
-
-# Optional: ngrok authtoken for the ngrok service (leave empty to skip public tunnel)
-NGROK_AUTHTOKEN=
-NGROK_DOMAIN=
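
After this change, the only variables visible in this hunk of `.env.example` are the embedding and OpenAI-compatible settings kept as context lines. The following is a minimal sketch of how a Python script might read them with the standard library. The `Settings` dataclass and `load_settings` helper are hypothetical names for illustration and are not taken from the repository; lines 1 to 5 of the file are not shown in the hunk, so any variables defined there are left out.

```python
import os
from dataclasses import dataclass

# Variable names copied from the context lines of the .env.example hunk.
REQUIRED_VARS = (
    "EMBEDDING_BASE_URL",
    "EMBEDDING_API_KEY",
    "OPENAI_BASE_URL",
    "OPENAI_API_KEY",
    "OPENAI_MODEL",
    "OPENAI_MODEL_GUARDED",
)


@dataclass(frozen=True)
class Settings:
    """Hypothetical container; field order matches REQUIRED_VARS."""
    embedding_base_url: str
    embedding_api_key: str
    openai_base_url: str
    openai_api_key: str
    openai_model: str
    openai_model_guarded: str


def load_settings(env=None) -> Settings:
    """Read the variables from a mapping (defaults to os.environ).

    Raises KeyError listing every missing or empty variable.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return Settings(*(env[name] for name in REQUIRED_VARS))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test. Code that still reads the removed `BACKEND_*`, `MCP_URL_*`, or `NGROK_*` variables would need its own handling, since the example file no longer documents them.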

README.md

Lines changed: 5 additions & 5 deletions
@@ -43,8 +43,8 @@ $ ffmpeg -i unused/banner_video.mp4 -vframes 1 project_banner.jpg
 - 💭 Model being overconfident in its incorrect knowledge.
 - 🚧 Lack of proper constraints or guidelines for the agent.
 - 📉 Inadequate training data for specific scenarios.
-- 🛠️ MCP server providing incorrect tool descriptions that mislead the agent.
-- 🎭 Harmful MCP servers returning manipulative text to mislead the model.
+- 🛠️ Tools with incorrect descriptions that mislead the agent.
+- 🎭 Harmful tools descriptions including manipulative text to mislead the model.
 - 😬 The experiments proved that the model performs a harmful action and still responds "Sorry, I can't help with that."
 
 ## 🆕 New contributions of Agent-Action-Guard framework:
@@ -76,13 +76,13 @@ $ ffmpeg -i unused/banner_video.mp4 -vframes 1 project_banner.jpg
 ## ✨ Special features:
 - This project introduces "HarmActionsBench" dataset and benchmark to evaluate an AI agent's probability of generating harmful actions.
 - The dataset has been used to train a lightweight neural network model that classifies actions as safe, harmful, or unethical.
-- ⚡ The model is lightweight and can be easily integrated into existing AI agent frameworks like MCP.
-- 🔌 Supports MCP (Model Context Protocol) to allow real-time action classification.
+- ⚡ The model is lightweight and can be easily integrated into existing AI agent frameworks.
+<!-- - 🔌 Supports MCP (Model Context Protocol) to allow real-time action classification. -->
 <!-- - Unlike OpenAI's `"require_approval": "always"` flag, this blocks harmful actions without human intervention. -->
 <!-- - 🤝 A2A-compatible version: https://github.com/Pro-GenAI/A2A-Agent-Action-Guard. -->
 
 🛡️ **Safety Features:**
-- 🔍 Automatically classifies MCP tool calls before execution.
+- 🔍 Automatically classifies tool calls before execution.
 - 🚫 Blocks harmful actions based on the outputs of the trained model.
 - 📋 Provides detailed classification results.
 - ✅ Allows safe actions to proceed normally.
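
The safety features in this hunk describe a classify-then-execute flow: a tool call is classified first, harmful calls are blocked, and safe calls proceed. The following is a minimal sketch of that flow, under stated assumptions. `classify_action` is a hypothetical keyword check standing in for the project's trained classifier, and `GuardResult` and `guarded_call` are illustrative names, not the project's API. The labels "safe", "harmful", and "unethical" come from the README text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class GuardResult:
    label: str          # "safe", "harmful", or "unethical"
    allowed: bool       # True only when the tool was actually executed
    output: Any = None  # tool return value, or None when blocked


def classify_action(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Hypothetical stand-in for the trained model: a naive keyword check."""
    text = f"{tool_name} {arguments}".lower()
    if any(word in text for word in ("malware", "exploit", "delete_all")):
        return "harmful"
    return "safe"


def guarded_call(
    tool: Callable[..., Any],
    tool_name: str,
    arguments: Dict[str, Any],
    classifier: Callable[[str, Dict[str, Any]], str] = classify_action,
) -> GuardResult:
    """Classify the call first; run the tool only if the label is "safe"."""
    label = classifier(tool_name, arguments)
    if label != "safe":
        # Blocked: the tool is never invoked, and the label is reported back.
        return GuardResult(label=label, allowed=False)
    return GuardResult(label=label, allowed=True, output=tool(**arguments))
```

Because the classifier is passed in as a parameter, the keyword check can be replaced with a real model without changing the wrapper. The returned label gives the caller the classification result whether or not the call was executed.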

assets/Workflow.drawio

Lines changed: 51 additions & 48 deletions
Large diffs are not rendered by default.

assets/Workflow.gif

31 KB

assets/cover.png

882 KB
