Commit b8aafc6

Add SAST container profile
- seclab-shell-sast image extends base with semgrep, pyan3, universal-ctags, GNU global, cscope, graphviz, ripgrep, fd, tree
- Toolbox YAML with server_prompt documenting Python and C call graph workflows
- Demo taskflow: tree, fd, semgrep, ctags, pyan3, gtags then summarise findings
- Runner generates a demo Python file with a shell=True anti-pattern if workspace is empty, so semgrep has something to find out of the box
- build_container_images.sh and run_container_shell_demo.sh updated for sast target
- test_toolbox_yaml_valid_sast added (16/16 passing)
1 parent c49cf62 commit b8aafc6

7 files changed

Lines changed: 196 additions & 4 deletions

scripts/build_container_images.sh

Lines changed: 11 additions & 1 deletion
@@ -30,6 +30,11 @@ build_network() {
     docker build -t seclab-shell-network-analysis:latest "${CONTAINERS_DIR}/network_analysis/"
 }
 
+build_sast() {
+    echo "Building seclab-shell-sast..."
+    docker build -t seclab-shell-sast:latest "${CONTAINERS_DIR}/sast/"
+}
+
 target="${1:-all}"
 
 case "$target" in
@@ -44,14 +49,19 @@ case "$target" in
         build_base
         build_network
         ;;
+    sast)
+        build_base
+        build_sast
+        ;;
     all)
         build_base
         build_malware
         build_network
+        build_sast
         ;;
     *)
         echo "Unknown target: $target" >&2
-        echo "Usage: $0 [base|malware|network|all]" >&2
+        echo "Usage: $0 [base|malware|network|sast|all]" >&2
         exit 1
         ;;
 esac

scripts/run_container_shell_demo.sh

Lines changed: 49 additions & 2 deletions
@@ -9,6 +9,7 @@
 # ./scripts/run_container_shell_demo.sh base [workspace_dir]
 # ./scripts/run_container_shell_demo.sh malware [workspace_dir] [target_filename]
 # ./scripts/run_container_shell_demo.sh network [workspace_dir] [capture_filename]
+# ./scripts/run_container_shell_demo.sh sast [workspace_dir] [target]
 #
 # If workspace_dir is omitted a temporary directory is used.
 # Requires AI_API_TOKEN to be set in the environment.
@@ -27,7 +28,7 @@ fi
 
 demo="${1:-}"
 if [ -z "$demo" ]; then
-    echo "Usage: $0 <base|malware|network> [workspace_dir] [target]" >&2
+    echo "Usage: $0 <base|malware|network|sast> [workspace_dir] [target]" >&2
     exit 1
 fi
 
@@ -72,8 +73,54 @@ case "$demo" in
             -t seclab_taskflows.taskflows.container_shell.demo_network_analysis \
             -g capture="$capture"
         ;;
+    sast)
+        target="${3:-.}"
+        if [ ! -e "${workspace}/${target}" ]; then
+            echo "No source found at ${workspace}/${target}" >&2
+            echo "Provide a source directory or file in workspace_dir." >&2
+            exit 1
+        fi
+        if [ "$target" = "." ] && [ -z "$(ls -A "$workspace" 2>/dev/null)" ]; then
+            echo "Generating demo Python source in ${workspace}"
+            cat > "${workspace}/demo.py" <<'PYEOF'
+import os
+import subprocess
+
+
+def read_config(path):
+    with open(path) as f:
+        return f.read()
+
+
+def run_command(cmd):
+    # intentional anti-pattern for demo purposes
+    return subprocess.run(cmd, shell=True, capture_output=True, text=True)
+
+
+def process_input(user_input):
+    result = run_command(f"echo {user_input}")
+    return result.stdout
+
+
+def main():
+    config = read_config("/etc/demo.conf") if os.path.exists("/etc/demo.conf") else ""
+    output = process_input("hello world")
+    print(config, output)
+
+
+if __name__ == "__main__":
+    main()
+PYEOF
+            target="demo.py"
+        fi
+        CONTAINER_WORKSPACE="$workspace" \
+        LOG_DIR="${__root}/logs" \
+        python -m seclab_taskflow_agent \
+            -t seclab_taskflows.taskflows.container_shell.demo_sast \
+            -g target="$target"
+        ;;
     *)
-        echo "Unknown demo: $demo. Choose base, malware, or network." >&2
+        echo "Unknown demo: $demo. Choose base, malware, network, or sast." >&2
         exit 1
         ;;
 esac
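
The generated `demo.py` is meant to trip a scanner because `run_command` hands an interpolated string to `subprocess.run(..., shell=True)`. The following is a minimal sketch of why that matters and is not part of the commit. The helper names `build_unsafe_command` and `run_safe` are hypothetical, and the child process is a small Python one-liner so the example does not depend on any particular shell utility being installed.

```python
import subprocess
import sys

# Child program that prints its first argument verbatim (illustrative only).
ECHO_ARGV = [sys.executable, "-c", "import sys; print(sys.argv[1])"]


def build_unsafe_command(user_input: str) -> str:
    """Mirror demo.py: interpolate input into a string intended for shell=True."""
    return f"echo {user_input}"


def run_safe(user_input: str) -> str:
    """Pass user input as one argv element, so no shell ever parses it."""
    result = subprocess.run(
        ECHO_ARGV + [user_input], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    payload = "hello; echo injected"
    # Under shell=True the ';' in this string would start a second command.
    print(build_unsafe_command(payload))
    # As an argv element the payload stays a single literal string.
    print(run_safe(payload), end="")
```

Passing a list of arguments and leaving `shell` at its default is the usual remedy for this class of finding; whether a given semgrep ruleset reports the original line depends on the rules that `--config=auto` resolves to.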
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+FROM seclab-shell-base:latest
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    universal-ctags global cscope ripgrep fd-find graphviz tree \
+    && ln -s /usr/bin/fdfind /usr/local/bin/fd \
+    && rm -rf /var/lib/apt/lists/*
+RUN pip3 install --no-cache-dir --break-system-packages semgrep pyan3
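
After building the image, it can be useful to confirm that every tool the profile advertises is actually on `PATH`. The sketch below is one way to do that from inside the container and is not part of the commit. The executable names in `EXPECTED_TOOLS` are assumptions derived from the package list above (for example `rg` for ripgrep, `dot` for graphviz, and `fd` via the symlink), and `missing_tools` is a hypothetical helper.

```python
import shutil

# Executable names assumed to be provided by the packages in the Dockerfile.
EXPECTED_TOOLS = [
    "semgrep", "pyan3", "ctags", "gtags", "global",
    "cscope", "rg", "fd", "dot", "tree",
]


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be resolved on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter exists only so the lookup can be replaced in a test without a real container.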

src/seclab_taskflows/taskflows/container_shell/README.md

Lines changed: 22 additions & 1 deletion
@@ -4,7 +4,7 @@ Runs arbitrary CLI commands inside an isolated Docker container. One container
 per MCP server process — started on the first `shell_exec` call, stopped on
 exit. An optional host directory is mounted at `/workspace` inside the container.
 
-Three container profiles are provided. Each has its own Dockerfile, toolbox
+Four container profiles are provided. Each has its own Dockerfile, toolbox
 YAML, and demo taskflow.
 
 ## Profiles
@@ -21,6 +21,10 @@ yara, exiftool, checksec, capstone, pwntools, volatility3.
 Packet capture analysis and network recon. Extends base with nmap, tcpdump,
 tshark, netcat, dig, jq, httpie.
 
+**sast** (`seclab-shell-sast:latest`)
+Static analysis and code exploration. Extends base with semgrep, pyan3,
+universal-ctags, GNU global, cscope, graphviz, ripgrep, fd, tree.
+
 ## Building the images
 
 Run from the repository root:
@@ -35,6 +39,7 @@ To build a single profile (the base image is always built first when needed):
 ./scripts/build_container_images.sh base
 ./scripts/build_container_images.sh malware
 ./scripts/build_container_images.sh network
+./scripts/build_container_images.sh sast
 ```
 
 Images only need to be rebuilt when a Dockerfile changes.
@@ -79,6 +84,22 @@ CONTAINER_WORKSPACE=/tmp/captures python -m seclab_taskflow_agent \
   -g capture=sample.pcap
 ```
 
+**SAST demo** — static analysis and call graph extraction for a source repo:
+
+```
+CONTAINER_WORKSPACE=/path/to/src python -m seclab_taskflow_agent \
+  -t seclab_taskflows.taskflows.container_shell.demo_sast
+```
+
+If no source is present the runner script generates a demo Python file with a
+shell-injection anti-pattern so semgrep has something to find:
+
+```
+./scripts/run_container_shell_demo.sh sast
+```
+
+Override the analysis target with `-g target=<path>` (relative to /workspace).
+
 ## Using container_shell in your own taskflows
 
 Reference the appropriate toolbox and set `CONTAINER_WORKSPACE` in `env`:
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+# Demo: SAST container shell.
+# Analyses source code in CONTAINER_WORKSPACE using semgrep, ctags, and pyan3.
+# Example:
+#   CONTAINER_WORKSPACE=/path/to/src python -m seclab_taskflow_agent \
+#     -t seclab_taskflows.taskflows.container_shell.demo_sast \
+#     -g target=mymodule.py
+
+seclab-taskflow-agent:
+  filetype: taskflow
+  version: "1.0"
+
+globals:
+  target: "."
+
+taskflow:
+  - task:
+      must_complete: true
+      headless: true
+      agents:
+        - seclab_taskflow_agent.personalities.assistant
+      toolboxes:
+        - seclab_taskflows.toolboxes.container_shell_sast
+      env:
+        CONTAINER_WORKSPACE: "{{ env('CONTAINER_WORKSPACE') }}"
+      user_prompt: |
+        Perform static analysis on the source code at /workspace/{{ globals.target }}.
+
+        1. Directory overview:
+           `tree /workspace/{{ globals.target }} 2>/dev/null | head -40`
+
+        2. Locate source files:
+           `fd -e py -e c -e h . /workspace/{{ globals.target }} | head -30`
+
+        3. Security scan with semgrep (report findings only, suppress progress):
+           `semgrep scan --config=auto --quiet /workspace/{{ globals.target }} 2>/dev/null`
+           If semgrep requires network access and it is unavailable, note that and skip.
+
+        4. Symbol index with ctags:
+           `ctags -R --fields=+ne -f /tmp/tags /workspace/{{ globals.target }} && grep -v "^!" /tmp/tags | head -40`
+
+        5. Call graph (Python only — skip if no .py files):
+           `pyan3 $(fd -e py . /workspace/{{ globals.target }} | tr '\n' ' ') --dot --no-defines 2>/dev/null | head -60`
+
+        6. Cross-reference entry points with GNU global:
+           `cd /workspace/{{ globals.target }} && gtags --compact -v 2>/dev/null && global -x main 2>/dev/null || echo "gtags: no entry point found"`
+
+        Based on the above, summarize:
+        - Security findings from semgrep (severity, rule, file, line)
+        - Key functions/classes and their call relationships
+        - Identified entry points and reachable sensitive operations
+        - Areas recommended for deeper review
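
The summary step asks for semgrep findings as severity, rule, file, and line. The taskflow itself reads semgrep's default text output, but the same four fields can be pulled from a report produced with semgrep's `--json` option. The sketch below is a minimal illustration, not part of the commit: `summarize` is a hypothetical helper, and `SAMPLE` is hand-written data with a made-up rule ID rather than real scanner output.

```python
import json

# Hand-written sample shaped like a semgrep JSON report (not real output).
SAMPLE = """
{"results": [
  {"check_id": "example.rules.subprocess-shell-true",
   "path": "demo.py",
   "start": {"line": 12, "col": 12},
   "extra": {"severity": "ERROR", "message": "shell=True with dynamic input"}}
 ],
 "errors": []}
"""


def summarize(report_text):
    """Return (severity, rule, file, line) tuples sorted by file and line."""
    report = json.loads(report_text)
    rows = []
    for finding in report.get("results", []):
        rows.append((
            finding.get("extra", {}).get("severity", "UNKNOWN"),
            finding.get("check_id", "?"),
            finding.get("path", "?"),
            finding.get("start", {}).get("line", 0),
        ))
    return sorted(rows, key=lambda row: (row[2], row[3]))


if __name__ == "__main__":
    for severity, rule, path, line in summarize(SAMPLE):
        print(f"{severity:8} {rule} {path}:{line}")
```

Defaulting missing keys instead of raising keeps the summary usable when a report is partial, at the cost of hiding malformed entries behind placeholder values.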
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# SPDX-FileCopyrightText: GitHub, Inc.
+# SPDX-License-Identifier: MIT
+
+seclab-taskflow-agent:
+  filetype: toolbox
+  version: "1.0"
+
+server_params:
+  kind: stdio
+  command: python
+  args: ["-m", "seclab_taskflows.mcp_servers.container_shell"]
+  env:
+    CONTAINER_IMAGE: "seclab-shell-sast:latest"
+    CONTAINER_WORKSPACE: "{{ env('CONTAINER_WORKSPACE', required=False) }}"
+    CONTAINER_TIMEOUT: "{{ env('CONTAINER_TIMEOUT', '60') }}"
+    LOG_DIR: "{{ env('LOG_DIR') }}"
+
+confirm:
+  - shell_exec
+
+server_prompt: |
+  ## Container Shell (SAST)
+  You have access to an isolated Docker container for static analysis and code exploration.
+  Source code is available at /workspace. All tools run inside the container.
+
+  Available tools:
+  - semgrep — SAST scanner; run with `semgrep scan --config=auto /workspace`
+  - pyan3 — Python static call graph generator (dot output for graphviz)
+  - ctags — Multi-language tag/symbol index (`ctags -R .`)
+  - global (gtags/global) — Source code cross-reference: `gtags` to index, `global -r <sym>` to find callers
+  - cscope — C/C++ cross-reference browser (`cscope -R -b` to build, `cscope -R -L -3 <sym>` for callers)
+  - graphviz (dot) — Render dot graphs: `dot -Tpng callgraph.dot -o callgraph.png`
+  - ripgrep (rg) — Fast regex search: `rg <pattern> /workspace`
+  - fd — Fast file finder: `fd -e py . /workspace`
+  - tree — Directory structure: `tree /workspace`
+  - grep, sed, awk, find — Standard code exploration
+
+  Typical call graph workflow (Python):
+  1. `fd -e py . /workspace | head -30` — find Python files
+  2. `pyan3 $(fd -e py . /workspace | tr '\n' ' ') --dot --no-defines > /tmp/cg.dot`
+  3. `dot -Tsvg /tmp/cg.dot -o /tmp/cg.svg && head -100 /tmp/cg.svg`
+
+  Typical call graph workflow (C):
+  1. `cd /workspace && ctags -R --fields=+ne .`
+  2. `cd /workspace && cscope -R -b && cscope -R -L -2 <function>`
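
The Python workflow above ends with a DOT file, and the demo taskflow asks which sensitive operations are reachable from an entry point. A rough sketch of that last step follows; it is not part of the commit. It reads plain `a -> b` edge lines and walks them breadth-first. `SAMPLE_DOT` is hand-written, and the `module__function` node naming is an assumption about what pyan3 emits; `parse_edges` and `reachable` are hypothetical helpers.

```python
import re
from collections import defaultdict, deque

# Matches simple DOT edge lines such as:  a -> b [style="solid"];
EDGE_RE = re.compile(r'^\s*"?([\w.]+)"?\s*->\s*"?([\w.]+)"?')

# Hand-written sample; node names are assumed, not captured from pyan3.
SAMPLE_DOT = """
digraph G {
    demo__main -> demo__read_config;
    demo__main -> demo__process_input;
    demo__process_input -> demo__run_command;
}
"""


def parse_edges(dot_text):
    """Build a caller -> {callees} mapping from DOT edge lines."""
    graph = defaultdict(set)
    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if match:
            graph[match.group(1)].add(match.group(2))
    return graph


def reachable(graph, start):
    """Return every node reachable from start via breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


if __name__ == "__main__":
    print(sorted(reachable(parse_edges(SAMPLE_DOT), "demo__main")))
```

A line-based regex is enough for flat edge lists but will miss edges declared in chained or multi-line form, so a real DOT parser would be the safer choice for anything beyond a quick look.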

tests/test_container_shell.py

Lines changed: 6 additions & 0 deletions
@@ -194,3 +194,9 @@ def test_toolbox_yaml_valid_network(self):
         result = tools.get_toolbox("seclab_taskflows.toolboxes.container_shell_network_analysis")
         assert result is not None
         assert result["seclab-taskflow-agent"]["filetype"] == "toolbox"
+
+    def test_toolbox_yaml_valid_sast(self):
+        tools = AvailableTools()
+        result = tools.get_toolbox("seclab_taskflows.toolboxes.container_shell_sast")
+        assert result is not None
+        assert result["seclab-taskflow-agent"]["filetype"] == "toolbox"
