Merged
Changes from 4 commits
23 changes: 22 additions & 1 deletion scripts/audit/run_audit.sh
@@ -4,14 +4,35 @@

  set -e

+ USE_ADVISORY=false
+
+ # Parse flags
+ while [[ "$1" == --* ]]; do
+   case "$1" in
+     --advisory)
+       USE_ADVISORY=true
+       shift
+       ;;
+     *)
+       echo "Unknown option: $1"
+       exit 1
+       ;;
Comment thread
Kwstubbs marked this conversation as resolved.
+   esac
+ done
+
  if [ -z "$1" ]; then
-   echo "Usage: $0 <repo>";
+   echo "Usage: $0 [--advisory] <repo>";
    exit 1;
  fi

  python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.fetch_source_code -g repo="$1"
  python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.identify_applications -g repo="$1"
  python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.gather_web_entry_point_info -g repo="$1"
+
+ if [ "$USE_ADVISORY" = true ]; then
+   python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.fetch_security_advisories -g repo="$1"
+ fi
+
  python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.classify_application_local -g repo="$1"
  python -m seclab_taskflow_agent -t seclab_taskflows.taskflows.audit.audit_issue_local_iter -g repo="$1"
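
The step ordering in `run_audit.sh` matters: the optional advisory fetch has to run before the classification and audit taskflows, since those prompts read the advisories from memcache. A minimal Python sketch of that ordering follows. The helper name `build_audit_commands` is hypothetical and not part of this PR; the taskflow names and command-line flags are copied from the script.

```python
from typing import List

TASKFLOW_PREFIX = "seclab_taskflows.taskflows.audit"


def build_audit_commands(repo: str, use_advisory: bool = False) -> List[List[str]]:
    """Return the taskflow invocations in the order run_audit.sh runs them."""
    steps = [
        "fetch_source_code",
        "identify_applications",
        "gather_web_entry_point_info",
    ]
    if use_advisory:
        # Runs before classification so the memcache key is already populated
        # when the later prompts try to read it.
        steps.append("fetch_security_advisories")
    steps += ["classify_application_local", "audit_issue_local_iter"]
    return [
        ["python", "-m", "seclab_taskflow_agent",
         "-t", f"{TASKFLOW_PREFIX}.{step}",
         "-g", f"repo={repo}"]
        for step in steps
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`, which stops at the first failing step in the same way `set -e` does in the shell script.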

@@ -20,3 +20,4 @@ toolboxes:
- seclab_taskflow_agent.toolboxes.memcache
- seclab_taskflows.toolboxes.gh_file_viewer
- seclab_taskflow_agent.toolboxes.codeql
+ - seclab_taskflows.toolboxes.ghsa
@@ -29,20 +29,26 @@ taskflow:
- seclab_taskflows.personalities.web_application_security_expert
model: code_analysis
user_prompt: |
- The issue is in repo {{ result.repo }} with id {{ result.issue_id }}. The component is under the directory
+ The issue is in repo {{ result.repo }} with id {{ result.issue_id }}. The component is under the directory
{{ result.location }} with component_id {{ result.component_id }}. The notes of the component is:

{{ result.component_notes }}

- You should use this to understand the intended purpose of the component and take it into account when
+ You should use this to understand the intended purpose of the component and take it into account when
you audit the issue.

The type of the issue is {{ result.issue_type }} and here is the notes of the issue:

{{ result.issue_notes }}

+ ## Known Security Advisories for this Repository
+
+ Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so. Otherwise, state how many advisories were found.
+ Review these advisories and consider them when identifying security risks. If you identify code that is an actual vulnerability with similar pattern to a known advisory, highlight that connection.
+
{% include 'seclab_taskflows.prompts.audit.audit_issue' %}
Comment on lines +46 to 49
Copilot AI Feb 17, 2026

Including the full “known security advisories” section inside a repeat_prompt audit loop means the same advisories may be fetched/summarized for every issue, increasing token usage and prompt size. Consider fetching/summarizing advisories once (outside the loop) and storing a short summary in memcache, then referencing that summary here.
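
A minimal sketch of the fetch-once approach suggested above, in Python. It assumes the advisories are already available as a list of dicts with `ghsa_id`, `severity`, and `summary` fields; those names follow GitHub's advisory objects, but what the `ghsa` toolbox actually returns is not shown in this PR. A plain dict stands in for memcache, and the function names and the `security_advisories_summary_` key prefix are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def summarize_advisories(advisories: List[dict], max_items: int = 5) -> str:
    """Build a short, size-bounded summary that can be reused in every prompt."""
    if not advisories:
        return "No advisories found."
    count = len(advisories)
    noun = "advisory" if count == 1 else "advisories"
    severities = Counter(adv.get("severity") or "unknown" for adv in advisories)
    lines = [
        f"{count} {noun} found.",
        "By severity: "
        + ", ".join(f"{sev}={n}" for sev, n in sorted(severities.items())),
    ]
    # Only list the first few advisories so the summary stays small.
    for adv in advisories[:max_items]:
        lines.append(f"- {adv.get('ghsa_id', 'unknown id')}: {adv.get('summary', '')}")
    return "\n".join(lines)


def cache_summary(cache: Dict[str, str], repo: str, advisories: List[dict]) -> str:
    """Store the summary once, outside the audit loop, and return its key."""
    key = f"security_advisories_summary_{repo}"  # hypothetical key name
    cache[key] = summarize_advisories(advisories)
    return key
```

With something like this, the per-issue prompt would only need to reference the stored summary rather than re-reading and re-summarizing the full advisory list for each issue.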

toolboxes:
- seclab_taskflows.toolboxes.repo_context
- seclab_taskflows.toolboxes.local_file_viewer

+ - seclab_taskflow_agent.toolboxes.memcache
+ - seclab_taskflows.toolboxes.ghsa
Comment thread
Kwstubbs marked this conversation as resolved.
Outdated
@@ -36,8 +36,13 @@ taskflow:
Fetch the entry points and web entry points of the component, then the user actions of this component.
Based on the entry points, web entry points, components, user actions and README.md and if available, SECURITY.md in the {{ globals.repo }},
can you tell me what type of application this repo is and what kind of security boundary it has.
- Based on this, determine whether the component is likely to have security problems.
-
+ Based on this, determine whether the component is likely to have security problems.
+
+ ## Known Security Advisories for this Repository
Contributor

Maybe pull this out into a reusable prompt and include it in both this and the other taskflow? This version is the better one to use, because it instructs the LLM to skip the advisory analysis if no advisory is found.

Contributor Author

Sure, makes sense. Let me know if it's what you were thinking.


+ Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so and skip advisory analysis. Otherwise, state how many advisories were found.
+ Review these advisories and consider them when identifying security risks. If you identify code that is similar to a known advisory pattern, highlight that connection.

Identify the most likely security problems in the component. Your task is not to carry out a full audit, but to
identify the main risk in the component so that further analysis can be carried out.
Do not be too specific about an issue, but rather craft your report based on the general functionality and type of
@@ -50,7 +55,7 @@ taskflow:
- Is this component likely to take untrusted user input? For example, remote web requests or IPC, RPC calls?
- What is the intended purpose of this component and its functionality? Does it allow high privileged actions?
Is it intended to provide such functionalities for all users? Or is there complex access control logic involved?
- - The component itself may also have its own `README.md` (or a subdirectory of it may have a `README.md`). Take
+ - The component itself may also have its own `README.md` (or a subdirectory of it may have a `README.md`). Take
a look at those files to help understand the functionality of the component.

For example, an Admin UI/dashboard may be susceptible to client side Javascript vulnerabilities such as XSS, CSRF.
@@ -60,7 +65,7 @@ taskflow:
a web frontend may allow users to access their own content and admins to access all content, but users should not
be able to access another users' content in general.

- We're looking for more concrete and serious security issues that affects system integrity or
+ We're looking for more concrete and serious security issues that affects system integrity or
lead to information leak, so please do not include issues like brute force, Dos, log injection etc.

Also do not include issues that require the system to be already compromised, such as issues that rely on malicious
@@ -72,9 +77,9 @@ taskflow:
Your task is to identify risk rather than properly audit and find security issues. Do not look too much into
the implementation or scrutinize the security measures such as access control and sanitizers at this stage.
Instead, report more general risks that are associated with the type of component
- that you are looking at.
+ that you are looking at.

- It is not your task to audit the security measures, but rather just to identify the risks and suggest some issues
+ It is not your task to audit the security measures, but rather just to identify the risks and suggest some issues
that is worth auditing.

Reflect on your notes and check that the attack scenario meets the above requirements. Exclude low severity issues or
@@ -84,4 +89,6 @@ taskflow:
If you think the issues satisfy the criteria, store a component issue entry for each type of issue identified.
toolboxes:
- seclab_taskflows.toolboxes.repo_context
- - seclab_taskflows.toolboxes.local_file_viewer
+ - seclab_taskflows.toolboxes.local_file_viewer
+ - seclab_taskflow_agent.toolboxes.memcache
+ - seclab_taskflows.toolboxes.ghsa
Comment thread
Kwstubbs marked this conversation as resolved.
Outdated
@@ -0,0 +1,34 @@
+ # SPDX-FileCopyrightText: GitHub, Inc.
+ # SPDX-License-Identifier: MIT
+
+ seclab-taskflow-agent:
+   filetype: taskflow
+   version: "1.0"
+
+ model_config: seclab_taskflows.configs.model_config
+
+ globals:
+   repo:
+
+ # Example taskflow to fetch and review security advisories for a repository
+ taskflow:
+   - task:
+       must_complete: true
+       exclude_from_context: false
+       agents:
+         - seclab_taskflow_agent.personalities.assistant
+       model: general_tasks
+       user_prompt: |
+         Fetch all GitHub Security Advisories (GHSAs) for the repo {{ globals.repo }}.
+
+         After fetching, store the list of advisories in memcache under the key 'security_advisories_{{ globals.repo }}'.
+
+         Provide a summary of:
+         1. How many advisories were found
+         2. The severity levels of the advisories
+         3. Key recommendations for addressing them

Copilot AI Feb 17, 2026

Downstream prompts expect to be able to reason about “how many advisories were found”, but this taskflow doesn’t define a stable memcache value format (e.g., empty list vs error string like “No advisories found.”). Consider normalizing what gets stored (e.g., always JSON list; store [] on no results; store a structured error separately) so later steps can reliably count/skip advisories.
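
One way to get the stable format described above, sketched in Python under stated assumptions: the raw tool result may be `None`, a JSON string, a Python list, or free-form text, and a plain dict stands in for memcache. The function names and the separate `security_advisories_message_` key are hypothetical; only the `security_advisories_<repo>` key comes from this PR.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def normalize_advisories(raw: Any) -> Tuple[List[Any], Optional[str]]:
    """Coerce a raw tool result into (advisory_list, optional_message)."""
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError:
            # Free-form text such as "No advisories found." or an error string:
            # keep it as a message instead of storing it as the advisory value.
            return [], raw
    if raw is None:
        return [], None
    if isinstance(raw, list):
        return raw, None
    return [], f"unexpected {type(raw).__name__} value"


def store_advisories(cache: Dict[str, str], repo: str, raw: Any) -> int:
    """Always store a JSON list under the advisory key; return the count."""
    advisories, message = normalize_advisories(raw)
    cache[f"security_advisories_{repo}"] = json.dumps(advisories)
    message_key = f"security_advisories_message_{repo}"  # hypothetical key
    if message is None:
        cache.pop(message_key, None)  # clear any stale message
    else:
        cache[message_key] = message
    return len(advisories)
```

Because the stored value is always a JSON list, a later step can count advisories with `len(json.loads(value))` and skip the advisory analysis when the result is `0`.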

+       toolboxes:
+         - seclab_taskflows.toolboxes.ghsa
+         - seclab_taskflow_agent.toolboxes.memcache
+         - seclab_taskflows.toolboxes.local_file_viewer
+         - seclab_taskflows.toolboxes.gh_file_viewer
Comment on lines +35 to +36

Copilot AI Feb 17, 2026

The fetch_security_advisories prompt only describes fetching advisories and writing them to memcache, but the task also grants the local_file_viewer and gh_file_viewer toolboxes. If they aren’t needed for this flow, removing them reduces tool surface area and avoids unnecessary tool calls/context overhead.

Suggested change
- - seclab_taskflows.toolboxes.local_file_viewer
- - seclab_taskflows.toolboxes.gh_file_viewer

@@ -1,4 +1,4 @@
- # SPDX-FileCopyrightText: 2025 GitHub
+ # SPDX-FileCopyrightText: GitHub, Inc.
# SPDX-License-Identifier: MIT

seclab-taskflow-agent: