fix(analysisinfo): detect auto-selected package from Windows analyzer logs #2937

Merged
kevoreilly merged 2 commits into kevoreilly:master from wmetcalf:fix/analysisinfo-package-detection on Mar 16, 2026

Conversation

@wmetcalf wmetcalf commented Mar 9, 2026

Summary

  • get_package() in analysisinfo.py only searched for the Linux analyzer log format (Automatically selected analysis package) when extracting the auto-selected package name.
  • The Windows analyzer uses a different format (analysis package selected:), so the package field was always empty in reports for Windows tasks where no package was explicitly configured.
  • get_package() now searches for both log formats and uses len(marker) instead of a hardcoded offset.

… logs

The Windows analyzer logs the package selection as:
  INFO: analysis package selected: "pkg"
but get_package() only searched for the Linux format:
  INFO: Automatically selected analysis package "pkg"

This caused the package field to remain empty in reports for
Windows analyses where no package was explicitly specified.

Now searches for both log formats using len(marker) instead of
a hardcoded offset.
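
To illustrate why a single hardcoded offset cannot serve both formats, here is a minimal sketch of marker-based extraction. The two marker strings are the ones quoted in this PR; the helper name `extract_package` and the sample log lines (including their timestamps and package names) are made up for illustration and are not the actual code in analysisinfo.py.

```python
# Hypothetical helper illustrating the approach; not the real get_package().
MARKERS = (
    'INFO: analysis package selected: "',               # Windows analyzer format
    'INFO: Automatically selected analysis package "',  # Linux analyzer format
)


def extract_package(analysis_log: str) -> str:
    """Return the auto-selected package name, or "" if no marker is present."""
    for marker in MARKERS:
        idx = analysis_log.find(marker)
        if idx != -1:
            # len(marker) adapts to each format; one fixed offset can only
            # be correct for a single marker length.
            return analysis_log[idx + len(marker):].split('"', 1)[0]
    return ""


# Made-up sample lines; only the marker text comes from the PR description.
windows_log = '2026-03-09 10:00:00 INFO: analysis package selected: "exe"\n'
linux_log = '2026-03-09 10:00:00 INFO: Automatically selected analysis package "generic"\n'
```

The two markers differ in length, so slicing at `idx + len(marker)` is what keeps the extraction correct for either format.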
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the auto-selected analysis package was not being correctly extracted from Windows analyzer logs, leading to incomplete reporting for Windows tasks. The changes ensure that the system can now parse package information from both Windows and Linux log formats, providing more accurate and comprehensive analysis results across different operating environments.

Highlights

  • Windows Analyzer Log Support: The get_package() function in analysisinfo.py has been updated to correctly identify auto-selected analysis packages from Windows analyzer logs, which use a different format than Linux logs.
  • Dynamic Offset Calculation: The hardcoded offset used for extracting the package name has been replaced with a dynamic calculation using len(marker), improving robustness and compatibility with varying log formats.

Changelog

  • modules/processing/analysisinfo.py
    • Improved package detection logic to support Windows analyzer log format.
    • Replaced hardcoded offset with dynamic len(marker) for robustness.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request successfully adds support for detecting the auto-selected package from Windows analyzer logs, complementing the existing Linux log format, and improves maintainability by using len(marker) instead of a hardcoded offset. However, the implementation is susceptible to log injection from the untrusted guest VM, which could allow malware to spoof the package name in the final analysis report. It is recommended to add validation or sanitization to the extracted package name to ensure it only contains expected characters. Additionally, a minor performance improvement can be made to avoid scanning the log file twice for each marker.

Comment on lines +72 to +79
for marker in (
    'INFO: analysis package selected: "',
    'INFO: Automatically selected analysis package "',
):
    if marker in analysis_log:
        idx = analysis_log.index(marker)
        package = analysis_log[idx + len(marker) :].split('"', 1)[0]
        break

security-medium medium

The get_package function extracts the analysis package name from the analysis.log file. This log's content can be influenced by an untrusted guest VM, making it vulnerable to log injection. An attacker could inject a fake log line to spoof the package name, potentially leading to data spoofing in the analysis report or Cross-Site Scripting (XSS) if the name is rendered without proper escaping. It is crucial to implement validation or sanitization on the extracted package name to mitigate this vulnerability. Additionally, the current approach of using if marker in analysis_log: followed by analysis_log.index(marker) causes the string to be scanned twice for each marker. For improved efficiency, consider using analysis_log.find(marker) instead, which avoids redundant scans.

Suggested change
- for marker in (
-     'INFO: analysis package selected: "',
-     'INFO: Automatically selected analysis package "',
- ):
-     if marker in analysis_log:
-         idx = analysis_log.index(marker)
-         package = analysis_log[idx + len(marker) :].split('"', 1)[0]
-         break
+ for marker in (
+     'INFO: analysis package selected: "',
+     'INFO: Automatically selected analysis package "',
+ ):
+     idx = analysis_log.find(marker)
+     if idx != -1:
+         package = analysis_log[idx + len(marker) :].split('"', 1)[0]
+         break
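
The suggested change covers only the double scan; the validation the review recommends is not shown. One possible shape for it is sketched below. The helper name `sanitize_package` and the allowlist pattern are assumptions made for illustration (the PR does not say which characters a package name may contain), and nothing like this was part of the merged change.

```python
import re

# Assumption: package names are short, plain identifiers. The real set of
# valid names is not stated in this PR, so treat this pattern as illustrative.
_PACKAGE_RE = re.compile(r"[A-Za-z0-9_]{1,64}")


def sanitize_package(candidate: str) -> str:
    """Return candidate if it looks like a plain package name, else ""."""
    return candidate if _PACKAGE_RE.fullmatch(candidate) else ""
```

A caller could pass the extracted string through `sanitize_package()` before storing it, so that a spoofed log line containing markup or other unexpected characters would yield an empty package field.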

Contributor Author

Applied the .find() suggestion to avoid the double scan — good catch. The log injection/XSS concern isn't applicable here though: the existing code already extracted from the same guest-controlled log in the same way, and the package name is used internally for processing, not rendered raw in any frontend context.
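
For context on the XSS point: if a value taken from a guest-controlled log were ever placed into HTML, escaping at output time is the usual safeguard. The sketch below uses Python's standard `html.escape`; the function `render_package_cell` is hypothetical and does not correspond to any renderer in this project.

```python
import html


def render_package_cell(package: str) -> str:
    """Hypothetical renderer: escape the value before embedding it in HTML."""
    # html.escape converts &, <, > and (by default) quote characters to entities.
    return f"<td>{html.escape(package)}</td>"
```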

@kevoreilly kevoreilly merged commit 974cd99 into kevoreilly:master Mar 16, 2026
4 checks passed
2 participants