Conversation
There was a problem hiding this comment.
Pull request overview
This pull request adds support for fetching and analyzing GitHub security advisories as part of the security audit workflow. The feature allows the audit process to consider known security advisories when identifying potential security risks in repositories.
Changes:
- Adds a new MCP server and toolbox for fetching GitHub security advisories via the GitHub API
- Integrates advisory checking into the classification and audit taskflows
- Adds an optional --advisory flag to the run_audit.sh script to enable advisory fetching
- Updates the web security expert personality to include the security advisories toolbox
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| src/seclab_taskflows/toolboxes/security_advisories.yaml | New toolbox configuration for the security advisories MCP server |
| src/seclab_taskflows/taskflows/audit/fetch_security_advisories.yaml | New taskflow to fetch and review security advisories for a repository |
| src/seclab_taskflows/taskflows/audit/classify_application_local.yaml | Updated to fetch and consider security advisories from memcache when classifying security risks |
| src/seclab_taskflows/taskflows/audit/audit_issue_local_iter.yaml | Updated to fetch and consider security advisories from memcache when auditing issues |
| src/seclab_taskflows/personalities/web_application_security_expert.yaml | Added security_advisories toolbox to the expert's available tools |
| src/seclab_taskflows/mcp_servers/security_advisories.py | New MCP server implementation that fetches advisories from GitHub API, filters by date, and stores in memcache |
| scripts/audit/run_audit.sh | Added --advisory flag to optionally run the fetch_security_advisories taskflow |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """ | ||
| owner = owner.strip().lower() | ||
| repo = repo.strip().lower() | ||
| gh_token = os.getenv("GH_TOKEN", "") |
There was a problem hiding this comment.
The GH_TOKEN is retrieved from environment variables twice: once at the module level (line 23) and again in the get_security_advisories function (line 109). The function should use the module-level GH_TOKEN variable instead of calling os.getenv again. Replace line 109 with using the existing GH_TOKEN variable to avoid redundant environment variable lookups.
| gh_token = os.getenv("GH_TOKEN", "") | |
| gh_token = GH_TOKEN |
|
|
||
| ## Known Security Advisories for this Repository | ||
|
|
||
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so and skip advisory analysis. Otherwise, state how many advisories were found. |
There was a problem hiding this comment.
There's a potential case sensitivity issue with the memcache key. The get_security_advisories function lowercases both owner and repo (lines 107-108) before creating the key as 'security_advisories_{owner}/{repo}'. However, the prompt instructs to fetch using 'security_advisories_{{ globals.repo }}', which will preserve the original case of globals.repo. If globals.repo contains uppercase letters (e.g., "Owner/Repo"), the lookup key won't match the stored key (e.g., "security_advisories_owner/repo"). Either lowercase globals.repo in the prompt template or ensure globals.repo is always provided in lowercase format.
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so and skip advisory analysis. Otherwise, state how many advisories were found. | |
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo | lower }}'). If the value in the memcache is null, clearly state so and skip advisory analysis. Otherwise, state how many advisories were found. |
|
|
||
| seclab-taskflow-agent: | ||
| filetype: toolbox | ||
| version: 1 |
There was a problem hiding this comment.
The version field format is inconsistent with the established convention. All existing toolbox files use a quoted string format version: "1.0", but this file uses an unquoted integer version: 1. Update to match the established pattern: version: "1.0"
| version: 1 | |
| version: "1.0" |
| GH_TOKEN: "{{ env GH_TOKEN }}" | ||
| LOG_DIR: "{{ env LOG_DIR }}" | ||
| MEMCACHE_STATE_DIR: "{{ env MEMCACHE_STATE_DIR }}" |
There was a problem hiding this comment.
The environment variable syntax is incorrect. All existing toolbox files use the function syntax "{{ env('VAR_NAME') }}" with parentheses, but this file uses "{{ env VAR_NAME }}" without them. This will likely cause runtime errors when the toolbox tries to access these environment variables. Update all three env references to use the correct syntax: "{{ env('GH_TOKEN') }}", "{{ env('LOG_DIR') }}", and "{{ env('MEMCACHE_STATE_DIR') }}"
| GH_TOKEN: "{{ env GH_TOKEN }}" | |
| LOG_DIR: "{{ env LOG_DIR }}" | |
| MEMCACHE_STATE_DIR: "{{ env MEMCACHE_STATE_DIR }}" | |
| GH_TOKEN: "{{ env('GH_TOKEN') }}" | |
| LOG_DIR: "{{ env('LOG_DIR') }}" | |
| MEMCACHE_STATE_DIR: "{{ env('MEMCACHE_STATE_DIR') }}" |
| @@ -0,0 +1,150 @@ | |||
| # SPDX-FileCopyrightText: 2025 GitHub | |||
There was a problem hiding this comment.
The SPDX copyright header format is inconsistent with the established convention in the codebase. All existing Python files in mcp_servers use "SPDX-FileCopyrightText: GitHub, Inc." without a year, but this file uses "SPDX-FileCopyrightText: 2025 GitHub". Update to match the established pattern: "SPDX-FileCopyrightText: GitHub, Inc."
| # SPDX-FileCopyrightText: 2025 GitHub | |
| # SPDX-FileCopyrightText: GitHub, Inc. |
| async def call_api(url: str) -> str: | ||
| """Call the GitHub API to fetch security advisories.""" | ||
| headers = { | ||
| "Accept": "application/vnd.github+json", | ||
| "X-GitHub-Api-Version": "2022-11-28", | ||
| "Authorization": f"Bearer {GH_TOKEN}", | ||
| } | ||
|
|
||
| try: | ||
| async with httpx.AsyncClient(headers=headers) as client: | ||
| r = await client.get(url, follow_redirects=True) | ||
| r.raise_for_status() | ||
| return r.text | ||
| except httpx.RequestError as e: | ||
| return f"Request error: {e}" | ||
| except httpx.HTTPStatusError as e: | ||
| return f"HTTP error {e.response.status_code}: {e.response.text}" | ||
| except Exception as e: | ||
| return f"Error: {e}" | ||
|
|
||
|
|
There was a problem hiding this comment.
The call_api function is defined but never used in this module. The get_security_advisories function implements its own API call logic inline (lines 112-134) instead of using this helper. Either remove this unused function or refactor get_security_advisories to use it for better code maintainability and consistency.
| async def call_api(url: str) -> str: | |
| """Call the GitHub API to fetch security advisories.""" | |
| headers = { | |
| "Accept": "application/vnd.github+json", | |
| "X-GitHub-Api-Version": "2022-11-28", | |
| "Authorization": f"Bearer {GH_TOKEN}", | |
| } | |
| try: | |
| async with httpx.AsyncClient(headers=headers) as client: | |
| r = await client.get(url, follow_redirects=True) | |
| r.raise_for_status() | |
| return r.text | |
| except httpx.RequestError as e: | |
| return f"Request error: {e}" | |
| except httpx.HTTPStatusError as e: | |
| return f"HTTP error {e.response.status_code}: {e.response.text}" | |
| except Exception as e: | |
| return f"Error: {e}" |
| break | ||
| page += 1 | ||
| except httpx.HTTPStatusError as e: | ||
| return f"HTTP error {e.response.status_code}: {e.response.text}" |
There was a problem hiding this comment.
The code is missing error handling for JSON parsing failures. If the API returns invalid JSON (line 124), json.loads will raise a JSONDecodeError that is not caught by the current exception handlers. Add json.JSONDecodeError to the exception handling on line 131 or add a separate try-except block around the json.loads call.
| return f"HTTP error {e.response.status_code}: {e.response.text}" | |
| return f"HTTP error {e.response.status_code}: {e.response.text}" | |
| except json.JSONDecodeError as e: | |
| return f"JSON parse error: {e}" |
|
|
||
| ## Known Security Advisories for this Repository | ||
|
|
||
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so. Otherwise, state how many advisories were found. |
There was a problem hiding this comment.
There's a potential case sensitivity issue with the memcache key. The get_security_advisories function lowercases both owner and repo (lines 107-108) before creating the key as 'security_advisories_{owner}/{repo}'. However, the prompt instructs to fetch using 'security_advisories_{{ globals.repo }}', which will preserve the original case of globals.repo. If globals.repo contains uppercase letters (e.g., "Owner/Repo"), the lookup key won't match the stored key (e.g., "security_advisories_owner/repo"). Either lowercase globals.repo in the prompt template or ensure globals.repo is always provided in lowercase format.
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo }}'). If the value in the memcache is null, clearly state so. Otherwise, state how many advisories were found. | |
| Fetch the security advisories for {{ globals.repo }} from memcache (stored under the key 'security_advisories_{{ globals.repo | lower }}'). If the value in the memcache is null, clearly state so. Otherwise, state how many advisories were found. |
| @@ -0,0 +1,15 @@ | |||
| # SPDX-FileCopyrightText: 2025 GitHub | |||
There was a problem hiding this comment.
The SPDX copyright header format is inconsistent with the established convention in the codebase. All existing files use "SPDX-FileCopyrightText: GitHub, Inc." without a year, but this file uses "SPDX-FileCopyrightText: 2025 GitHub". Update to match the established pattern: "SPDX-FileCopyrightText: GitHub, Inc."
| # SPDX-FileCopyrightText: 2025 GitHub | |
| # SPDX-FileCopyrightText: GitHub, Inc. |
| @@ -0,0 +1,31 @@ | |||
| # SPDX-FileCopyrightText: 2025 GitHub | |||
There was a problem hiding this comment.
The SPDX copyright header format is inconsistent with the established convention in the codebase. All existing files use "SPDX-FileCopyrightText: GitHub, Inc." without a year, but this file uses "SPDX-FileCopyrightText: 2025 GitHub". Update to match the established pattern: "SPDX-FileCopyrightText: GitHub, Inc."
| # SPDX-FileCopyrightText: 2025 GitHub | |
| # SPDX-FileCopyrightText: GitHub, Inc. |
No description provided.