fix(analysisinfo): detect auto-selected package from Windows analyzer logs by wmetcalf · Pull Request #2937 · kevoreilly/CAPEv2

wmetcalf · 2026-03-09T15:24:14Z

Summary

- get_package() in analysisinfo.py only searched for the Linux analyzer log format (Automatically selected analysis package) when extracting the auto-selected package name.
- The Windows analyzer uses a different format (analysis package selected:), so the package field was always empty in reports for Windows tasks where no package was explicitly configured.
- It now searches for both log formats and uses len(marker) instead of a hardcoded offset.

… logs

The Windows analyzer logs the package selection as:
    INFO: analysis package selected: "pkg"
but get_package() only searched for the Linux format:
    INFO: Automatically selected analysis package "pkg"

This caused the package field to remain empty in reports for Windows analyses where no package was explicitly specified. Now searches for both log formats using len(marker) instead of a hardcoded offset.
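The two log formats described above can be handled by a single loop over candidate markers. The following is a minimal sketch of that idea, not the code in analysisinfo.py: the function name extract_package, the MARKERS tuple, and the sample log lines (including their "[root]" prefix) are assumptions made for illustration. Only the two marker strings come from this pull request.

```python
# Marker strings quoted in this pull request; everything else is illustrative.
MARKERS = (
    'INFO: analysis package selected: "',               # Windows analyzer
    'INFO: Automatically selected analysis package "',  # Linux analyzer
)


def extract_package(analysis_log: str) -> str:
    """Return the auto-selected package name, or "" when no marker matches."""
    for marker in MARKERS:
        if marker in analysis_log:
            idx = analysis_log.index(marker)
            # Skip past the marker, then read up to the closing quote.
            return analysis_log[idx + len(marker):].split('"', 1)[0]
    return ""


# Hypothetical log excerpts; real analysis.log lines may carry other prefixes.
windows_log = '[root] INFO: analysis package selected: "pkg"\n'
linux_log = '[root] INFO: Automatically selected analysis package "pkg"\n'

assert extract_package(windows_log) == "pkg"
assert extract_package(linux_log) == "pkg"
assert extract_package("INFO: nothing relevant here\n") == ""
```

With only the Linux marker in the tuple, the first assertion would fail, which mirrors the empty package field reported for Windows tasks.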

gemini-code-assist · 2026-03-09T15:24:31Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the auto-selected analysis package was not being correctly extracted from Windows analyzer logs, leading to incomplete reporting for Windows tasks. The changes ensure that the system can now parse package information from both Windows and Linux log formats, providing more accurate and comprehensive analysis results across different operating environments.

Highlights

- Windows Analyzer Log Support: The get_package() function in analysisinfo.py has been updated to correctly identify auto-selected analysis packages from Windows analyzer logs, which use a different format than Linux logs.
- Dynamic Offset Calculation: The hardcoded offset used for extracting the package name has been replaced with a dynamic calculation using len(marker), improving robustness and compatibility with varying log formats.

Changelog

modules/processing/analysisinfo.py
- Improved package detection logic to support Windows analyzer log format.
- Replaced hardcoded offset with dynamic len(marker) for robustness.
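To illustrate why len(marker) is more robust than a fixed number once more than one marker is in play, here is a small sketch. The previous hardcoded value is not shown in this conversation, so the stale offset below is simply the length of the Linux marker, chosen as an assumption. The helper extract_with_offset and the sample log are likewise hypothetical.

```python
LINUX_MARKER = 'INFO: Automatically selected analysis package "'
WINDOWS_MARKER = 'INFO: analysis package selected: "'


def extract_with_offset(log: str, marker: str, offset: int) -> str:
    """Slice `offset` characters past the marker and read up to the next quote."""
    idx = log.find(marker)
    if idx == -1:
        return ""
    return log[idx + offset:].split('"', 1)[0]


# Hypothetical Windows-style log excerpt.
log = 'INFO: analysis package selected: "pkg"\nINFO: next line\n'

# An offset tuned for the longer Linux marker overshoots the package name.
stale = extract_with_offset(log, WINDOWS_MARKER, len(LINUX_MARKER))
# An offset derived from the marker actually matched lands on the name.
dynamic = extract_with_offset(log, WINDOWS_MARKER, len(WINDOWS_MARKER))

assert len(LINUX_MARKER) != len(WINDOWS_MARKER)
assert stale != "pkg"
assert dynamic == "pkg"
```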

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request successfully adds support for detecting the auto-selected package from Windows analyzer logs, complementing the existing Linux log format, and improves maintainability by using len(marker) instead of a hardcoded offset. However, the implementation is susceptible to log injection from the untrusted guest VM, which could allow malware to spoof the package name in the final analysis report. It is recommended to add validation or sanitization to the extracted package name to ensure it only contains expected characters. Additionally, a minor performance improvement can be made to avoid scanning the log file twice for each marker.
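If validation of the extracted name were wanted, one possible shape is an allow-list check applied after extraction. This is a hedged sketch only: the function sanitize_package_name, the permitted character set, and the 64-character limit are all assumptions, not rules taken from CAPEv2, and real package names may need a different pattern.

```python
import re

# Assumed allow-list: letters, digits, underscore, dot, hyphen; 1 to 64 chars.
_PACKAGE_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def sanitize_package_name(candidate: str) -> str:
    """Return the candidate if it matches the allow-list, otherwise ""."""
    return candidate if _PACKAGE_RE.fullmatch(candidate) else ""


assert sanitize_package_name("exe") == "exe"
assert sanitize_package_name("<script>alert(1)</script>") == ""
assert sanitize_package_name("") == ""
```

A check like this limits which characters can reach the report, but it cannot tell a genuine log line from an injected one that names a plausible package.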

gemini-code-assist · 2026-03-09T15:27:44Z

+                    for marker in (
+                        'INFO: analysis package selected: "',
+                        'INFO: Automatically selected analysis package "',
+                    ):
+                        if marker in analysis_log:
+                            idx = analysis_log.index(marker)
+                            package = analysis_log[idx + len(marker) :].split('"', 1)[0]
+                            break


The get_package function extracts the analysis package name from the analysis.log file. This log's content can be influenced by an untrusted guest VM, making it vulnerable to log injection. An attacker could inject a fake log line to spoof the package name, potentially leading to data spoofing in the analysis report or Cross-Site Scripting (XSS) if the name is rendered without proper escaping. It is crucial to implement validation or sanitization on the extracted package name to mitigate this vulnerability. Additionally, the current approach of using if marker in analysis_log: followed by analysis_log.index(marker) causes the string to be scanned twice for each marker. For improved efficiency, consider using analysis_log.find(marker) instead, which avoids redundant scans.

Suggested change

-                    for marker in (
-                        'INFO: analysis package selected: "',
-                        'INFO: Automatically selected analysis package "',
-                    ):
-                        if marker in analysis_log:
-                            idx = analysis_log.index(marker)
-                            package = analysis_log[idx + len(marker) :].split('"', 1)[0]
-                            break
+                    for marker in (
+                        'INFO: analysis package selected: "',
+                        'INFO: Automatically selected analysis package "',
+                    ):
+                        idx = analysis_log.find(marker)
+                        if idx != -1:
+                            package = analysis_log[idx + len(marker) :].split('"', 1)[0]
+                            break
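The two variants in the suggestion should select the same marker and position; the difference is that `in` followed by `index()` searches the string twice on a hit, while `find()` searches once. A small sketch to check the equivalence, where the helper names and the sample logs are hypothetical:

```python
MARKERS = (
    'INFO: analysis package selected: "',
    'INFO: Automatically selected analysis package "',
)


def first_hit_in_index(log, markers=MARKERS):
    """Original form: membership test, then index() (two scans on a hit)."""
    for marker in markers:
        if marker in log:
            return marker, log.index(marker)
    return None


def first_hit_find(log, markers=MARKERS):
    """Suggested form: a single find() call per marker."""
    for marker in markers:
        idx = log.find(marker)
        if idx != -1:
            return marker, idx
    return None


# Hypothetical logs: Windows only, Linux only, both formats, and no match.
samples = [
    'INFO: analysis package selected: "a"\n',
    'INFO: Automatically selected analysis package "a"\n',
    'INFO: Automatically selected analysis package "a"\n'
    'INFO: analysis package selected: "b"\n',
    "INFO: unrelated line\n",
]

for log in samples:
    assert first_hit_in_index(log) == first_hit_find(log)

# When both formats appear, the order of MARKERS decides, not position in the log.
assert first_hit_find(samples[2])[0] == MARKERS[0]
```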

wmetcalf · Mar 9, 2026

Applied the .find() suggestion to avoid the double scan; good catch. The log injection/XSS concern isn't applicable here though: the existing code already extracted from the same guest-controlled log in the same way, and the package name is used internally for processing, not rendered raw in any frontend context.

gemini-code-assist Bot reviewed Mar 9, 2026

refactor: use str.find() to avoid double scan of analysis log

ca5fe67

kevoreilly merged commit 974cd99 into kevoreilly:master Mar 16, 2026
4 checks passed

kevoreilly merged 2 commits into kevoreilly:master from wmetcalf:fix/analysisinfo-package-detection
