Commit e82df67

Merge branch 'main' into dependabot/pip/fastmcp-2.14.2
2 parents 11fe7a8 + f7313ba commit e82df67

43 files changed

Lines changed: 4334 additions & 1277 deletions

README.md

Lines changed: 114 additions & 2 deletions
@@ -1,8 +1,11 @@
 # GitHub Security Lab Taskflow Agent

-The Security Lab Taskflow Agent is an MCP enabled multi-Agent framework.
+The Security Lab Taskflow Agent is an MCP-enabled multi-Agent framework for
+declarative, YAML-driven agentic workflows.

-The Taskflow Agent is built on top of the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/).
+Built on top of the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/),
+it uses [Pydantic](https://docs.pydantic.dev/) for grammar validation and
+[Jinja2](https://jinja.palletsprojects.com/) for template rendering.

 ## Core Concepts

@@ -16,6 +19,115 @@ Agents can cooperate to complete sequences of tasks through so-called [taskflows

 You can find a detailed overview of the taskflow grammar [here](doc/GRAMMAR.md) and example taskflows [here](examples/taskflows/).

+## Architecture
+
+```
+┌─────────────────────────────────────────────────────┐
+│  CLI (cli.py)                                       │
+│  Typer-based entry point: -p, -t, -l, -g, --resume  │
+└─────────────────────┬───────────────────────────────┘
+
+┌─────────────────────▼───────────────────────────────┐
+│  Runner (runner.py)                                 │
+│  Taskflow execution loop, model resolution,         │
+│  template rendering, session checkpointing          │
+└─────────────────────┬───────────────────────────────┘
+
+┌─────────────────────▼───────────────────────────────┐
+│  MCP Lifecycle (mcp_lifecycle.py)                   │
+│  Server connection, cleanup, process management     │
+└─────────────────────┬───────────────────────────────┘
+
+┌─────────────────────▼───────────────────────────────┐
+│  Agent (agent.py)                                   │
+│  TaskAgent wrapper, hooks, OpenAI Agents SDK bridge │
+└─────────────────────────────────────────────────────┘
+
+Supporting modules:
+  models.py          — Pydantic v2 grammar models (validation)
+  session.py         — Task-level checkpoint / resume
+  available_tools.py — YAML resource loader with caching
+  template_utils.py  — Jinja2 template environment
+  mcp_utils.py       — MCP client parameter resolution
+  mcp_transport.py   — MCP transport implementations (stdio, streamable)
+  mcp_prompt.py      — System prompt construction
+  prompt_parser.py   — Legacy prompt argument parser
+  capi.py            — AI API endpoint and token management
+  path_utils.py      — Platform-aware data/log directories
+```
+
+### API Types
+
+The agent supports both the **Chat Completions** and **Responses** OpenAI APIs.
+The API type can be configured globally or per model in a `model_config` file:
+
+```yaml
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+api_type: chat_completions # default for all models
+models:
+  gpt_default: gpt-4.1
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_responses:
+    api_type: responses # override for this model
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN # env var name containing the API key
+```
+
+Per-model `model_settings` can include:
+- **`api_type`** — `"chat_completions"` (default) or `"responses"`
+- **`endpoint`** — API base URL override for this model
+- **`token`** — name of an environment variable containing the API key
+
+### Session Recovery
+
+Taskflow runs are automatically checkpointed at the task level. If a task
+fails after exhausting retries, the session is saved and can be resumed:
+
+```
+** 🤖💾 Session saved: abc123def456
+** 🤖💡 Resume with: --resume abc123def456
+```
+
+Resume from the last successful checkpoint:
+
+```bash
+python -m seclab_taskflow_agent --resume abc123def456
+```
+
+Failed tasks are automatically retried up to 3 times with increasing backoff
+before the session is saved. Session checkpoints are stored in the
+platform-specific application data directory.
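
Note on the Session Recovery text above: the retry-then-checkpoint behaviour can be pictured with a minimal sketch. This is not the project's `runner.py` or `session.py`. The names `run_with_retries` and `run_taskflow`, the doubling delay, and the reading of "up to 3 times" as three attempts in total are all assumptions made for illustration.

```python
import time


def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call task(); after a failure wait base_delay * 2**n and try again.

    The attempt count and the doubling delay are assumptions; the README
    only says "up to 3 times with increasing backoff".
    """
    for attempt in range(max_retries):
        try:
            return task()
        except Exception:
            if attempt == max_retries - 1:
                raise  # retries exhausted, the caller records a checkpoint
            sleep(base_delay * 2 ** attempt)


def run_taskflow(tasks, resume_from=0, sleep=time.sleep):
    """Run tasks in order, starting at index resume_from.

    Returns (results, checkpoint). checkpoint is None when every task
    succeeded, otherwise the index of the task to resume from
    (a hypothetical stand-in for a saved session id).
    """
    results = []
    for index in range(resume_from, len(tasks)):
        try:
            results.append(run_with_retries(tasks[index], sleep=sleep))
        except Exception:
            return results, index
    return results, None
```

Passing the returned index back as `resume_from` is meant to mirror the idea behind `--resume`: tasks that already completed are skipped and execution restarts at the one that failed.
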
+
+### Error Output
+
+By default, errors are shown as concise one-line messages. Use `--debug` (or
+set `TASK_AGENT_DEBUG=1`) for full tracebacks:
+
+```bash
+# Concise (default)
+Error: [BadRequestError] model 'foo' not found
+(use --debug for full traceback)
+
+# Full traceback
+python -m seclab_taskflow_agent --debug -t examples.taskflows.echo
+```
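
Note on the Error Output text above: a minimal sketch of how a concise-versus-traceback switch could look. `format_error` is a hypothetical helper, not the CLI's actual code. The only details taken from the README are the message shape, the `--debug` flag, and the `TASK_AGENT_DEBUG=1` variable.

```python
import os
import traceback


def format_error(exc, debug=False, env=None):
    """Return a one-line summary, or a full traceback in debug mode.

    env defaults to os.environ; it is a parameter only so the sketch
    can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    if debug or env.get("TASK_AGENT_DEBUG") == "1":
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
    return (
        f"Error: [{type(exc).__name__}] {exc}\n"
        "(use --debug for full traceback)"
    )
```
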
+
+### MCP Environment Denylist
+
+By default, MCP server subprocesses inherit the parent environment. To prevent
+specific variables from leaking to MCP servers, set `TASKFLOW_ENV_DENYLIST` to
+a comma-separated list of variable names:
+
+```bash
+export TASKFLOW_ENV_DENYLIST="MY_SECRET_TOKEN,PRIVATE_KEY,OTHER_CREDENTIAL"
+```
+
+Toolbox-level `env:` declarations in YAML still inject exactly what each server
+needs, so explicitly configured variables are unaffected.
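
Note on the denylist text above: the filtering it describes can be sketched as below, assuming the denylist is applied to the inherited environment first and toolbox `env:` entries are layered on afterwards. `build_server_env` is a hypothetical name, and whitespace trimming around the comma-separated names is an assumption.

```python
def build_server_env(parent_env, toolbox_env=None,
                     denylist_var="TASKFLOW_ENV_DENYLIST"):
    """Build the environment for one MCP server subprocess.

    1. Parse the comma-separated denylist from the parent environment.
    2. Drop denied names from the inherited variables.
    3. Apply explicit toolbox env: entries last, so they always survive.
    """
    raw = parent_env.get(denylist_var, "")
    denied = {name.strip() for name in raw.split(",") if name.strip()}
    env = {key: value for key, value in parent_env.items() if key not in denied}
    env.update(toolbox_env or {})
    return env
```

In real use `parent_env` would be `os.environ`, and the result could be passed as the `env` argument of `subprocess.Popen`.
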
+
 ## Use Cases and Examples

 The Seclab Taskflow Agent framework was primarily designed to fit the iterative feedback loop driven work involved in Agentic security research workflows and vulnerability triage tasks.

doc/GRAMMAR.md

Lines changed: 36 additions & 1 deletion
@@ -509,4 +509,39 @@ When `gpt_latest` is used in the taskflow to specify a model, the value `gpt-5`

 ```

-This provides a easy way to update model versions in a taskflow.
+This provides an easy way to update model versions in a taskflow.
+
+#### Per-model settings
+
+A `model_config` file can include per-model settings via `model_settings` and a
+global `api_type` that applies to all models unless overridden:
+
+```yaml
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+api_type: chat_completions # default for all models
+models:
+  gpt_default: gpt-4.1
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_default:
+    temperature: 0.7
+  gpt_responses:
+    api_type: responses # use the Responses API for this model
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN # env var name containing the API key
+    temperature: 0.5
+```
+
+The following keys in `model_settings` are handled by the engine and are not
+passed to the underlying model provider:
+
+| Key | Description | Default |
+|-----|-------------|---------|
+| `api_type` | `"chat_completions"` or `"responses"` | Inherited from top-level `api_type`, or `"chat_completions"` |
+| `endpoint` | API base URL for this model | The global `AI_API_ENDPOINT` env var |
+| `token` | Name of an environment variable containing the API key | Uses `AI_API_TOKEN` / `COPILOT_TOKEN` |
+
+All other keys (e.g. `temperature`, `top_p`) are passed through as model
+parameters to the OpenAI SDK.
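
Note on the per-model settings table above: the split between engine-handled keys and pass-through model parameters can be sketched as follows. This is an illustration under stated assumptions, not the engine's code. `resolve_model` and `CONFIG` are hypothetical names, the config is shown as an already-parsed dict, and trying `AI_API_TOKEN` before `COPILOT_TOKEN` is a guess at the fallback order.

```python
import os

ENGINE_KEYS = ("api_type", "endpoint", "token")

# The YAML example from GRAMMAR.md, written out as a parsed dict.
CONFIG = {
    "api_type": "chat_completions",
    "models": {"gpt_default": "gpt-4.1", "gpt_responses": "gpt-5.1"},
    "model_settings": {
        "gpt_default": {"temperature": 0.7},
        "gpt_responses": {
            "api_type": "responses",
            "endpoint": "https://api.githubcopilot.com",
            "token": "CAPI_TOKEN",
            "temperature": 0.5,
        },
    },
}


def resolve_model(alias, config, env=None):
    """Resolve an alias into engine settings plus pass-through parameters."""
    env = os.environ if env is None else env
    settings = dict(config.get("model_settings", {}).get(alias, {}))
    # Engine-handled keys are removed; whatever remains goes to the provider.
    engine = {key: settings.pop(key) for key in ENGINE_KEYS if key in settings}
    token_var = engine.get("token")
    if token_var:
        token = env.get(token_var)  # `token` names an env var, not a secret
    else:
        token = env.get("AI_API_TOKEN") or env.get("COPILOT_TOKEN")
    return {
        "model": config.get("models", {}).get(alias, alias),
        "api_type": engine.get("api_type")
        or config.get("api_type", "chat_completions"),
        "endpoint": engine.get("endpoint") or env.get("AI_API_ENDPOINT"),
        "token": token,
        "model_params": settings,
    }
```

With this sketch, `resolve_model("gpt_default", CONFIG)` inherits the top-level `api_type`, while `gpt_responses` uses its own `api_type`, `endpoint`, and token variable; `temperature` ends up in `model_params` in both cases.
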
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Example: per-model API type and endpoint configuration.
+# gpt_responses uses the Responses API on the CAPI endpoint,
+# reading its token from the CAPI_TOKEN env var.
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+models:
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_responses:
+    api_type: responses
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Comprehensive test taskflow that exercises every grammar feature:
+# - model_config reference with model aliases
+# - globals (with CLI override via -g)
+# - inputs (task-level template variables)
+# - env (task-scoped environment variables)
+# - must_complete
+# - exclude_from_context
+# - max_steps
+# - MCP toolboxes (echo)
+# - shell task (run)
+# - repeat_prompt + async iteration
+# - reusable tasks (uses)
+# - reusable prompts ({% include %})
+# - agent handoffs (multi-agent)
+# - headless mode
+# - blocked_tools
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: taskflow
+
+model_config: examples.model_configs.model_config
+
+globals:
+  topic: fruit
+  detail_level: brief
+
+taskflow:
+  # ---------------------------------------------------------------
+  # Task 1: Shell task — produces a JSON array for repeat_prompt
+  # Features: run, must_complete
+  # ---------------------------------------------------------------
+  - task:
+      name: generate-items
+      must_complete: true
+      run: |
+        echo '[{"name": "apple", "color": "red"}, {"name": "banana", "color": "yellow"}, {"name": "orange", "color": "orange"}]'
+
+  # ---------------------------------------------------------------
+  # Task 2: Repeat prompt over shell output, async iteration
+  # Features: repeat_prompt, async, async_limit, exclude_from_context,
+  #           model (alias), inputs, globals, env, max_steps
+  # ---------------------------------------------------------------
+  - task:
+      name: describe-items
+      repeat_prompt: true
+      async: true
+      async_limit: 3
+      exclude_from_context: true
+      must_complete: true
+      model: gpt_default
+      max_steps: 10
+      agents:
+        - examples.personalities.fruit_expert
+      inputs:
+        format: one-sentence
+      env:
+        FRUIT_MODE: "analysis"
+      user_prompt: |
+        The topic is {{ globals.topic }} at {{ globals.detail_level }} detail level.
+        Describe the {{ result.name }} (which is {{ result.color }}) in {{ inputs.format }} format.
+
+  # ---------------------------------------------------------------
+  # Task 3: MCP tool call with echo server
+  # Features: toolboxes, headless, blocked_tools
+  # ---------------------------------------------------------------
+  - task:
+      name: echo-test
+      must_complete: true
+      headless: true
+      agents:
+        - examples.personalities.echo
+      user_prompt: |
+        Echo the following message: "All {{ globals.topic }} items processed successfully"
+      blocked_tools:
+        - nonexistent_tool_to_test_filtering
+
+  # ---------------------------------------------------------------
+  # Task 4: Reusable task via `uses`
+  # Features: uses (inherits from single_step_taskflow)
+  # ---------------------------------------------------------------
+  - task:
+      name: reusable-task
+      uses: examples.taskflows.single_step_taskflow
+      model: gpt_default
+
+  # ---------------------------------------------------------------
+  # Task 5: Reusable prompt via {% include %}
+  # Features: Jinja2 include directive, reusable prompts
+  # ---------------------------------------------------------------
+  - task:
+      name: include-prompt
+      agents:
+        - examples.personalities.fruit_expert
+      model: gpt_default
+      max_steps: 5
+      user_prompt: |
+        Tell me about apples.
+
+        {% include 'examples.prompts.example_prompt' %}
+
+        Keep your answer to two sentences per fruit.
+
+  # ---------------------------------------------------------------
+  # Task 6: Agent handoffs (multi-agent)
+  # Features: multiple agents (first=primary, rest=handoff targets)
+  # ---------------------------------------------------------------
+  - task:
+      name: handoff-test
+      model: gpt_default
+      max_steps: 15
+      agents:
+        - examples.personalities.fruit_expert
+        - examples.personalities.apple_expert
+        - examples.personalities.banana_expert
+        - examples.personalities.orange_expert
+      user_prompt: |
+        You are a fruit coordinator. I need specific expert advice on each fruit.
+        Please hand off to the apple expert for a one-sentence fact about apples,
+        then to the banana expert for a one-sentence fact about bananas,
+        then to the orange expert for a one-sentence fact about oranges.
+        Each expert should provide exactly one interesting fact.
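
Note on Task 1 and Task 2 of the taskflow above: the `repeat_prompt` plus `async_limit` combination can be pictured as fanning one prompt out over each element of the previous task's JSON array, with a cap on how many run at once. The sketch below is an assumption about those semantics, not the runner's implementation. `repeat_prompt`, `render`, and `run_agent` are hypothetical names, and an f-string stands in for Jinja2 rendering of `{{ result.name }}` and `{{ result.color }}`.

```python
import asyncio
import json


def render(result):
    """Stand-in for rendering the Task 2 prompt template for one item."""
    return (
        f"Describe the {result['name']} (which is {result['color']}) "
        "in one-sentence format."
    )


async def repeat_prompt(shell_output, render, run_agent, async_limit=3):
    """Run one agent call per JSON array element, at most async_limit at a time."""
    items = json.loads(shell_output)
    limit = asyncio.Semaphore(async_limit)

    async def run_one(item):
        async with limit:  # blocks while async_limit calls are in flight
            return await run_agent(render(item))

    # gather returns results in the same order as the input items
    return await asyncio.gather(*(run_one(item) for item in items))
```

Here `run_agent` would be whatever coroutine sends a prompt to the model; in a test it can be a stub that simply returns the prompt.
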
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Echo taskflow using the Responses API with MCP tool calls.
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: taskflow
+
+model_config: examples.model_configs.responses_api
+
+taskflow:
+  - task:
+      max_steps: 5
+      must_complete: true
+      agents:
+        - examples.personalities.echo
+      model: gpt_responses
+      user_prompt: |
+        Hello from the Responses API
