Skip to content

Fix parsing algorithm in ValueParser#522

Merged
centosinfra-prod-github-app[bot] merged 4 commits into
mainfrom
parser
Mar 11, 2026
Merged

Fix parsing algorithm in ValueParser#522
centosinfra-prod-github-app[bot] merged 4 commits into
mainfrom
parser

Conversation

@nforro
Copy link
Copy Markdown
Member

@nforro nforro commented Mar 11, 2026

No description provided.

nforro added 3 commits March 10, 2026 20:40
`ExpressionExpansion` being a subclass of `ShellExpansion` means
that `isinstance(node, ShellExpansion)` returns `True` even if node
is an instance of `ExpressionExpansion`. Fix that.

Signed-off-by: Nikola Forró <nforro@redhat.com>
Signed-off-by: Nikola Forró <nforro@redhat.com>
This fixes rather naive colon processing in bracketed expressions
and makes the parsing algorithm follow more closely what RPM does.

Signed-off-by: Nikola Forró <nforro@redhat.com>
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the core parsing mechanism for macro substitutions within the ValueParser module. The changes introduce a more structured hierarchy for macro nodes and enhance the parsing algorithm to accurately interpret various macro formats, including single-argument and nested expressions. This leads to a more robust and predictable handling of macro expansions throughout the system.

Highlights

  • Refactored Macro Node Hierarchy: The BuiltinMacro class was removed and its functionality was absorbed by a new SingleArgEnclosedMacroSubstitution class, which better represents macros with a single argument. An abstract Expansion base class was also introduced, with ShellExpansion and ExpressionExpansion now inheriting from it.
  • Improved ValueParser Algorithm: The ValueParser.parse method was significantly revised to correctly handle various macro parsing scenarios, including nested macros and those with single arguments, by refining how string literals and different macro types are identified and processed.
  • Updated Source Collection Logic: The collect_loaded_sources function in spec_parser.py was updated to recognize the new SingleArgEnclosedMacroSubstitution class when identifying %{load:...} macros, ensuring compatibility with the refactored parsing logic.
  • Enhanced Test Coverage: New test cases were added to test_value_parser.py to validate the improved parsing algorithm, specifically for complex and nested macro substitutions, ensuring the robustness of the changes.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • specfile/spec_parser.py
    • Updated imports to reflect the removal of BuiltinMacro and the addition of SingleArgEnclosedMacroSubstitution.
    • Modified collect_loaded_sources to use SingleArgEnclosedMacroSubstitution for identifying %{load:...} macros and access the correct argument.
  • specfile/value_parser.py
    • Introduced an abstract Expansion base class.
    • Refactored ShellExpansion and ExpressionExpansion to inherit from the new Expansion class.
    • Added the SingleArgEnclosedMacroSubstitution class to represent single-argument bracket-enclosed macros.
    • Removed the BuiltinMacro class.
    • Significantly revised the ValueParser.parse method to improve the parsing of various macro types, including nested and single-argument macros, and to correctly handle string literals surrounding macros.
    • Updated the ValueParser.evaluate method to recognize SingleArgEnclosedMacroSubstitution.
  • tests/unit/test_value_parser.py
    • Updated imports to align with the changes in specfile/value_parser.py.
    • Modified existing test cases and added new ones to validate the correct parsing of single-argument and nested macro substitutions.
Activity
  • The pull request was opened by nforro.
  • No human activity (comments, reviews, or progress updates) has been recorded yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly improves the parsing algorithm in ValueParser for handling enclosed macros (%{...}). The refactoring introduces a new Expansion base class and replaces BuiltinMacro with SingleArgEnclosedMacroSubstitution, leading to a cleaner design. However, it introduces or maintains several security risks: recursive parsing for nested macros can lead to a Denial of Service via RecursionError, automatic expansion of macros can lead to arbitrary command execution if untrusted spec files are processed, and unhandled exceptions in path comparisons can cause the parser to crash. While the new parsing logic is more robust and handles several edge cases, a minor issue was identified in a new test case where the expected output appears to be incorrect.

Comment thread specfile/spec_parser.py
Comment thread specfile/value_parser.py
Comment thread tests/unit/test_value_parser.py
Signed-off-by: Nikola Forró <nforro@redhat.com>
@nforro nforro moved this from New to In review in Packit pull requests Mar 11, 2026
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 11, 2026

/gemini review

Copy link
Copy Markdown
Member

@lbarcziova lbarcziova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly refactors the ValueParser to improve robustness with invalid macro constructs and enhance code clarity through the introduction of Expansion and SingleArgEnclosedMacroSubstitution. However, it introduces critical security vulnerabilities by allowing a bypass of the Sanitizer. The parser's restrictive macro identification can be exploited by attackers to hide malicious macros (like lua, include, or load) using leading spaces or nested macros, potentially leading to Remote Code Execution (RCE) or arbitrary file access when RPM expands the unsanitized output. The parser should be updated to match RPM's liberal macro identification, and the sanitizer must recursively validate macro names. Additionally, a minor issue was found in a new test case with an incorrect expectation.

Comment thread specfile/value_parser.py
Comment thread tests/unit/test_value_parser.py
@nforro nforro added the mergeit Merge via Zuul label Mar 11, 2026
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@centosinfra-prod-github-app centosinfra-prod-github-app Bot merged commit 692a2db into main Mar 11, 2026
50 of 51 checks passed
@github-project-automation github-project-automation Bot moved this from In review to Done in Packit pull requests Mar 11, 2026
@nforro nforro deleted the parser branch March 11, 2026 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mergeit Merge via Zuul

Projects

Development

Successfully merging this pull request may close these issues.

3 participants