GitHubSecurityLab
diff --git a/README.md b/README.md
Lines changed: 114 additions & 2 deletions
diff --git a/doc/GRAMMAR.md b/doc/GRAMMAR.md
Lines changed: 36 additions & 1 deletion
diff --git a/examples/model_configs/responses_api.yaml b/examples/model_configs/responses_api.yaml
Lines changed: 17 additions & 0 deletions
diff --git a/examples/taskflows/comprehensive_test.yaml b/examples/taskflows/comprehensive_test.yaml
Lines changed: 125 additions & 0 deletions
diff --git a/examples/taskflows/echo_responses_api.yaml b/examples/taskflows/echo_responses_api.yaml
Lines changed: 20 additions & 0 deletions
@@ -1,8 +1,11 @@
 # GitHub Security Lab Taskflow Agent
 
-The Security Lab Taskflow Agent is an MCP enabled multi-Agent framework.
+The Security Lab Taskflow Agent is an MCP-enabled multi-Agent framework for
+declarative, YAML-driven agentic workflows.
 
-The Taskflow Agent is built on top of the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/).
+Built on top of the [OpenAI Agents SDK](https://openai.github.io/openai-agents-python/),
+it uses [Pydantic](https://docs.pydantic.dev/) for grammar validation and
+[Jinja2](https://jinja.palletsprojects.com/) for template rendering.
 
 ## Core Concepts
 
@@ -16,6 +19,115 @@ Agents can cooperate to complete sequences of tasks through so-called [taskflows
 
 You can find a detailed overview of the taskflow grammar [here](doc/GRAMMAR.md) and example taskflows [here](examples/taskflows/).
 
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────┐
+│                   CLI (cli.py)                      │
+│  Typer-based entry point: -p, -t, -l, -g, --resume │
+└─────────────────────┬───────────────────────────────┘
+                      │
+┌─────────────────────▼───────────────────────────────┐
+│              Runner (runner.py)                      │
+│  Taskflow execution loop, model resolution,          │
+│  template rendering, session checkpointing           │
+└─────────────────────┬───────────────────────────────┘
+                      │
+┌─────────────────────▼───────────────────────────────┐
+│          MCP Lifecycle (mcp_lifecycle.py)            │
+│  Server connection, cleanup, process management      │
+└─────────────────────┬───────────────────────────────┘
+                      │
+┌─────────────────────▼───────────────────────────────┐
+│            Agent (agent.py)                          │
+│  TaskAgent wrapper, hooks, OpenAI Agents SDK bridge  │
+└─────────────────────────────────────────────────────┘
+
+Supporting modules:
+  models.py          — Pydantic v2 grammar models (validation)
+  session.py         — Task-level checkpoint / resume
+  available_tools.py — YAML resource loader with caching
+  template_utils.py  — Jinja2 template environment
+  mcp_utils.py       — MCP client parameter resolution
+  mcp_transport.py   — MCP transport implementations (stdio, streamable)
+  mcp_prompt.py      — System prompt construction
+  prompt_parser.py   — Legacy prompt argument parser
+  capi.py            — AI API endpoint and token management
+  path_utils.py      — Platform-aware data/log directories
+```
+
+### API Types
+
+The agent supports both the **Chat Completions** and **Responses** OpenAI APIs.
+The API type can be configured globally or per model in a `model_config` file:
+
+```yaml
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+api_type: chat_completions        # default for all models
+models:
+  gpt_default: gpt-4.1
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_responses:
+    api_type: responses           # override for this model
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN             # env var name containing the API key
+```
+
+Per-model `model_settings` can include:
+- **`api_type`** — `"chat_completions"` (default) or `"responses"`
+- **`endpoint`** — API base URL override for this model
+- **`token`** — name of an environment variable containing the API key
+
+### Session Recovery
+
+Taskflow runs are automatically checkpointed at the task level. If a task
+fails after exhausting retries, the session is saved and can be resumed:
+
+```
+** 🤖💾 Session saved: abc123def456
+** 🤖💡 Resume with: --resume abc123def456
+```
+
+Resume from the last successful checkpoint:
+
+```bash
+python -m seclab_taskflow_agent --resume abc123def456
+```
+
+Failed tasks are automatically retried up to 3 times with increasing backoff
+before the session is saved. Session checkpoints are stored in the
+platform-specific application data directory.
+
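Review note: the retry behaviour described in the Session Recovery text can be pictured with the sketch below. It is a minimal illustration, not the runner's actual code. `run_with_retries`, the doubling delay, and the injectable `sleep` parameter are all hypothetical; only the limit of 3 retries comes from the README text. The sketch also assumes "retried up to 3 times" means one initial attempt plus three retries.

```python
import time
from typing import Callable

MAX_RETRIES = 3  # from the README text; everything else here is an assumption


def run_with_retries(
    task: Callable[[], str],
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run task once, then retry up to MAX_RETRIES times with a growing delay."""
    last_error = None
    for attempt in range(MAX_RETRIES + 1):
        try:
            return task()
        except Exception as exc:  # a real runner would likely catch narrower errors
            last_error = exc
            if attempt < MAX_RETRIES:
                # Hypothetical schedule: the delay doubles after each failure.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"task failed after {MAX_RETRIES} retries") from last_error


# Usage: a task that fails twice and then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("transient failure")
    return "ok"


delays = []
result = run_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

In this picture, the caller would catch the final `RuntimeError`, save the session, and print the `--resume` hint.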
+### Error Output
+
+By default, errors are shown as concise one-line messages. Use `--debug` (or
+set `TASK_AGENT_DEBUG=1`) for full tracebacks:
+
+```bash
+# Concise (default) output:
+#   Error: [BadRequestError] model 'foo' not found
+#   (use --debug for full traceback)
+
+# Full traceback:
+python -m seclab_taskflow_agent --debug -t examples.taskflows.echo
+```
+
+### MCP Environment Denylist
+
+By default, MCP server subprocesses inherit the parent environment. To prevent
+specific variables from leaking to MCP servers, set `TASKFLOW_ENV_DENYLIST` to
+a comma-separated list of variable names:
+
+```bash
+export TASKFLOW_ENV_DENYLIST="MY_SECRET_TOKEN,PRIVATE_KEY,OTHER_CREDENTIAL"
+```
+
+Toolbox-level `env:` declarations in YAML still inject exactly what each server
+needs, so explicitly configured variables are unaffected.
+
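Review note: one way to read the denylist semantics above is sketched below. This is an assumption-laden illustration rather than the code in `mcp_utils.py`. `build_mcp_env` is a hypothetical helper; only the `TASKFLOW_ENV_DENYLIST` name and its comma-separated format come from the README text. The sketch strips denied names from the inherited environment first and applies toolbox-level `env:` values afterwards, so explicitly configured variables survive.

```python
from typing import Dict, Optional


def build_mcp_env(
    parent_env: Dict[str, str],
    toolbox_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return the environment for an MCP server subprocess (hypothetical helper)."""
    raw = parent_env.get("TASKFLOW_ENV_DENYLIST", "")
    # Tolerate spaces and empty entries in the comma-separated list.
    denied = {name.strip() for name in raw.split(",") if name.strip()}
    env = {key: value for key, value in parent_env.items() if key not in denied}
    # Toolbox-level env: declarations are applied last, so they are never filtered.
    env.update(toolbox_env or {})
    return env


parent = {
    "PATH": "/usr/bin",
    "MY_SECRET_TOKEN": "s3cret",
    "TASKFLOW_ENV_DENYLIST": "MY_SECRET_TOKEN, PRIVATE_KEY",
}
inherited_only = build_mcp_env(parent)
with_toolbox = build_mcp_env(parent, {"MY_SECRET_TOKEN": "scoped-value"})
print(sorted(inherited_only))
print(with_toolbox["MY_SECRET_TOKEN"])
```

Whether the denylist variable itself is forwarded to the subprocess is not stated in the README; the sketch forwards it.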
 ## Use Cases and Examples
 
 The Seclab Taskflow Agent framework was primarily designed to fit the iterative feedback loop driven work involved in Agentic security research workflows and vulnerability triage tasks.
 
@@ -509,4 +509,39 @@ When `gpt_latest` is used in the taskflow to specify a model, the value `gpt-5`
 
 ```
 
-This provides a easy way to update model versions in a taskflow.
+This provides an easy way to update model versions in a taskflow.
+
+#### Per-model settings
+
+A `model_config` file can include per-model settings via `model_settings` and a
+global `api_type` that applies to all models unless overridden:
+
+```yaml
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+api_type: chat_completions        # default for all models
+models:
+  gpt_default: gpt-4.1
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_default:
+    temperature: 0.7
+  gpt_responses:
+    api_type: responses           # use the Responses API for this model
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN             # env var name containing the API key
+    temperature: 0.5
+```
+
+The following keys in `model_settings` are handled by the engine and are not
+passed to the underlying model provider:
+
+| Key | Description | Default |
+|-----|-------------|---------|
+| `api_type` | `"chat_completions"` or `"responses"` | Inherited from top-level `api_type`, or `"chat_completions"` |
+| `endpoint` | API base URL for this model | Value of the global `AI_API_ENDPOINT` env var |
+| `token` | Name of an environment variable containing the API key | The `AI_API_TOKEN` / `COPILOT_TOKEN` env vars |
+
+All other keys (e.g. `temperature`, `top_p`) are passed through as model
+parameters to the OpenAI SDK.
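Review note: the split between engine-handled keys and pass-through model parameters can be sketched as follows. This is a hedged illustration of the table above, not the engine's implementation. `split_model_settings`, `ENGINE_KEYS`, and the `api_key` result field are hypothetical names. The sketch leaves out the fallback to `AI_API_ENDPOINT` and `AI_API_TOKEN` / `COPILOT_TOKEN`, because their precedence is not spelled out here.

```python
import os
from typing import Any, Dict, Mapping, Tuple

# Keys the grammar doc says are consumed by the engine, not the model provider.
ENGINE_KEYS = ("api_type", "endpoint", "token")


def split_model_settings(
    settings: Mapping[str, Any],
    global_api_type: str = "chat_completions",
    environ: Mapping[str, str] = os.environ,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate engine-level settings from parameters passed to the model."""
    engine = {key: settings[key] for key in ENGINE_KEYS if key in settings}
    model_params = {
        key: value for key, value in settings.items() if key not in ENGINE_KEYS
    }
    # A per-model api_type wins; otherwise inherit the top-level value.
    engine.setdefault("api_type", global_api_type)
    if "token" in engine:
        # `token` holds the NAME of an environment variable, not the secret itself.
        engine["api_key"] = environ.get(engine.pop("token"))
    return engine, model_params


engine, params = split_model_settings(
    {
        "api_type": "responses",
        "endpoint": "https://api.githubcopilot.com",
        "token": "CAPI_TOKEN",
        "temperature": 0.5,
    },
    environ={"CAPI_TOKEN": "example-key"},
)
print(engine)
print(params)
```

Keeping the variable name rather than the secret in YAML means the config file can be committed without exposing credentials.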
@@ -0,0 +1,17 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Example: per-model API type and endpoint configuration.
+# gpt_responses uses the Responses API on the CAPI endpoint,
+# reading its token from the CAPI_TOKEN env var.
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: model_config
+models:
+  gpt_responses: gpt-5.1
+model_settings:
+  gpt_responses:
+    api_type: responses
+    endpoint: https://api.githubcopilot.com
+    token: CAPI_TOKEN
@@ -0,0 +1,125 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Comprehensive test taskflow that exercises every grammar feature:
+#   - model_config reference with model aliases
+#   - globals (with CLI override via -g)
+#   - inputs (task-level template variables)
+#   - env (task-scoped environment variables)
+#   - must_complete
+#   - exclude_from_context
+#   - max_steps
+#   - MCP toolboxes (echo)
+#   - shell task (run)
+#   - repeat_prompt + async iteration
+#   - reusable tasks (uses)
+#   - reusable prompts ({% include %})
+#   - agent handoffs (multi-agent)
+#   - headless mode
+#   - blocked_tools
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: taskflow
+
+model_config: examples.model_configs.model_config
+
+globals:
+  topic: fruit
+  detail_level: brief
+
+taskflow:
+  # ---------------------------------------------------------------
+  # Task 1: Shell task — produces a JSON array for repeat_prompt
+  # Features: run, must_complete
+  # ---------------------------------------------------------------
+  - task:
+      name: generate-items
+      must_complete: true
+      run: |
+        echo '[{"name": "apple", "color": "red"}, {"name": "banana", "color": "yellow"}, {"name": "orange", "color": "orange"}]'
+
+  # ---------------------------------------------------------------
+  # Task 2: Repeat prompt over shell output, async iteration
+  # Features: repeat_prompt, async, async_limit, exclude_from_context,
+  #           model (alias), inputs, globals, env, max_steps
+  # ---------------------------------------------------------------
+  - task:
+      name: describe-items
+      repeat_prompt: true
+      async: true
+      async_limit: 3
+      exclude_from_context: true
+      must_complete: true
+      model: gpt_default
+      max_steps: 10
+      agents:
+        - examples.personalities.fruit_expert
+      inputs:
+        format: one-sentence
+      env:
+        FRUIT_MODE: "analysis"
+      user_prompt: |
+        The topic is {{ globals.topic }} at {{ globals.detail_level }} detail level.
+        Describe the {{ result.name }} (which is {{ result.color }}) in {{ inputs.format }} format.
+
+  # ---------------------------------------------------------------
+  # Task 3: MCP tool call with echo server
+  # Features: toolboxes, headless, blocked_tools
+  # ---------------------------------------------------------------
+  - task:
+      name: echo-test
+      must_complete: true
+      headless: true
+      agents:
+        - examples.personalities.echo
+      user_prompt: |
+        Echo the following message: "All {{ globals.topic }} items processed successfully"
+      blocked_tools:
+        - nonexistent_tool_to_test_filtering
+
+  # ---------------------------------------------------------------
+  # Task 4: Reusable task via `uses`
+  # Features: uses (inherits from single_step_taskflow)
+  # ---------------------------------------------------------------
+  - task:
+      name: reusable-task
+      uses: examples.taskflows.single_step_taskflow
+      model: gpt_default
+
+  # ---------------------------------------------------------------
+  # Task 5: Reusable prompt via {% include %}
+  # Features: Jinja2 include directive, reusable prompts
+  # ---------------------------------------------------------------
+  - task:
+      name: include-prompt
+      agents:
+        - examples.personalities.fruit_expert
+      model: gpt_default
+      max_steps: 5
+      user_prompt: |
+        Tell me about apples.
+
+        {% include 'examples.prompts.example_prompt' %}
+
+        Keep your answer to two sentences per fruit.
+
+  # ---------------------------------------------------------------
+  # Task 6: Agent handoffs (multi-agent)
+  # Features: multiple agents (first=primary, rest=handoff targets)
+  # ---------------------------------------------------------------
+  - task:
+      name: handoff-test
+      model: gpt_default
+      max_steps: 15
+      agents:
+        - examples.personalities.fruit_expert
+        - examples.personalities.apple_expert
+        - examples.personalities.banana_expert
+        - examples.personalities.orange_expert
+      user_prompt: |
+        You are a fruit coordinator. I need specific expert advice on each fruit.
+        Please hand off to the apple expert for a one-sentence fact about apples,
+        then to the banana expert for a one-sentence fact about bananas,
+        then to the orange expert for a one-sentence fact about oranges.
+        Each expert should provide exactly one interesting fact.
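Review note: Task 2 in this file combines `repeat_prompt`, `async`, and `async_limit: 3`. The sketch below shows one plausible reading of that combination: parse the JSON array from the previous shell task, then process each element concurrently with at most `async_limit` items in flight. `describe` and `run_repeat_prompt` are hypothetical names, and the `asyncio.sleep(0)` call stands in for the real agent run. The actual runner may schedule iterations differently.

```python
import asyncio
import json
from typing import Dict, List


async def describe(item: Dict[str, str], limit: asyncio.Semaphore) -> str:
    """Build the per-item prompt; the sleep is a stand-in for an agent call."""
    async with limit:  # at most async_limit coroutines pass this point at once
        await asyncio.sleep(0)
        return f"Describe the {item['name']} (which is {item['color']})"


async def run_repeat_prompt(shell_output: str, async_limit: int = 3) -> List[str]:
    """Fan out one prompt per JSON array element, bounded by async_limit."""
    items = json.loads(shell_output)
    limit = asyncio.Semaphore(async_limit)
    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(describe(item, limit) for item in items))


shell_output = (
    '[{"name": "apple", "color": "red"}, '
    '{"name": "banana", "color": "yellow"}]'
)
prompts = asyncio.run(run_repeat_prompt(shell_output))
for prompt in prompts:
    print(prompt)
```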
@@ -0,0 +1,20 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Echo taskflow using the Responses API with MCP tool calls.
+
+seclab-taskflow-agent:
+  version: "1.0"
+  filetype: taskflow
+
+model_config: examples.model_configs.responses_api
+
+taskflow:
+  - task:
+      max_steps: 5
+      must_complete: true
+      agents:
+        - examples.personalities.echo
+      model: gpt_responses
+      user_prompt: |
+        Hello from the Responses API