add tests and refactor API extraction and symbol generation helpers #1899 #2909

Open
pranavthakur0-0 wants to merge 1 commit into mandiant:master from pranavthakur0-0:add-tests-api-extraction-1899

@pranavthakur0-0

Closes #1899

Changes

Tests (in tests/test_helpers.py)

  • Added test_is_aw_function() — 8 assertions covering A/W suffixes, edge cases
  • Added test_is_ordinal() — 5 assertions covering ordinal detection
  • Added test_trim_dll_part() — 6 assertions covering DLL stripping, ordinals, .NET
  • Added test_reformat_forwarded_export_name() — 4 assertions covering forwarded exports
  • Extended test_generate_symbols() with .drv/.so extension trimming, uppercase DLL, W-suffix
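
For readers who do not have the helpers in front of them, the behaviours named in the list above can be approximated as below. This is a rough sketch, not the code in capa/features/extractors/helpers.py: the `sketch_` names are hypothetical, and the rules themselves (a leading `#` marks an ordinal, an A/W function has at least two characters and ends in `A` or `W`, a forwarded export lowercases the DLL part and keeps the symbol verbatim) are assumptions inferred from the test descriptions.

```python
def sketch_is_ordinal(symbol: str) -> bool:
    # Assumption: ordinal imports are written with a leading '#', e.g. "#1".
    return symbol.startswith("#")


def sketch_is_aw_function(symbol: str) -> bool:
    # Assumption: names shorter than two characters never count,
    # otherwise the last character must be 'A' or 'W'.
    return len(symbol) >= 2 and symbol[-1] in ("A", "W")


def sketch_reformat_forwarded_export_name(forwarded_name: str) -> str:
    # Assumption: split on the LAST period so a DLL path that itself
    # contains periods stays intact; lowercase only the DLL part.
    dll, _, symbol = forwarded_name.rpartition(".")
    return f"{dll.lower()}.{symbol}"


if __name__ == "__main__":
    print(sketch_is_ordinal("#1"))
    print(sketch_is_aw_function("CreateFileW"))
    print(sketch_reformat_forwarded_export_name("NTDLL.RtlAllocateHeap"))
```

Note that a suffix check this simple also accepts names that merely happen to end in `A` or `W`, which is why edge cases are worth pinning down in tests.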

Refactoring

  • trim_dll_part(): separated :: (.NET) check into its own early return branch
  • generate_symbols(): early return for ordinals, is_ordinal() called only once
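
The shape of the two refactorings can be illustrated with the sketch below. It is an approximation under stated assumptions, not the code in this PR: the `sketch_` function names are hypothetical, the ordinal check is inlined as a leading `#`, and the order in which symbol variants are yielded is a guess.

```python
from typing import Iterator


def sketch_trim_dll_part(api: str) -> str:
    # Ordinal imports such as "ws2_32.#1" keep their DLL prefix.
    if ".#" in api:
        return api
    # .NET names such as "System.Convert::FromBase64String" get their own
    # early return and are passed through unchanged.
    if "::" in api:
        return api
    # "kernel32.CreateFileA" -> "CreateFileA" (only when there is one period).
    if api.count(".") == 1:
        return api.split(".")[1]
    return api


def sketch_generate_symbols(dll: str, symbol: str, include_dll: bool = False) -> Iterator[str]:
    # Normalise the DLL name and trim one known extension.
    dll = dll.lower()
    for ext in (".dll", ".drv", ".so"):
        if dll.endswith(ext):
            dll = dll[: -len(ext)]
            break

    # Early return: ordinals always carry the DLL name and have no other
    # variants, so the ordinal check happens exactly once.
    if symbol.startswith("#"):
        yield f"{dll}.{symbol}"
        return

    if include_dll:
        yield f"{dll}.{symbol}"
    yield symbol

    # A/W functions also yield the name without the suffix.
    if len(symbol) >= 2 and symbol[-1] in ("A", "W"):
        if include_dll:
            yield f"{dll}.{symbol[:-1]}"
        yield symbol[:-1]


if __name__ == "__main__":
    print(list(sketch_generate_symbols("WS2_32.dll", "#1")))
    print(list(sketch_generate_symbols("kernel32.dll", "CreateFileW")))
    print(sketch_trim_dll_part("kernel32.CreateFileA"))
```

With the ordinal case handled up front, the rest of the generator only has to reason about named symbols.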

All 7 tests pass.

@google-cla

google-cla bot commented Mar 11, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness and clarity of API extraction and symbol generation within the capa project. It introduces comprehensive unit tests for several helper functions, ensuring their correct behavior across various edge cases, and refactors key logic to improve readability and maintainability.

Highlights

  • New Unit Tests: New unit tests were added for is_aw_function, is_ordinal, trim_dll_part, and reformat_forwarded_export_name to ensure correctness of API and symbol helper functions.
  • Extended Test Coverage: The test_generate_symbols function was extended to cover additional scenarios, including .drv/.so extension trimming, uppercase DLL names, and W-suffixed functions.
  • Refactoring of trim_dll_part: The trim_dll_part function was refactored to separate the handling of .NET namespaces into its own early return branch, improving clarity and maintainability.
  • Refactoring of generate_symbols: The generate_symbols function was refactored to include an early return for ordinal symbols, optimizing its logic and ensuring is_ordinal() is called only once.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

  • CHANGELOG.md
    • Added a changelog entry documenting the addition of tests and refactoring of API extraction and symbol generation helpers.
  • capa/features/extractors/helpers.py
    • Refactored generate_symbols to handle ordinal imports with an early return.
    • Adjusted logic for A/W functions to ensure consistent symbol generation.
  • capa/rules/__init__.py
    • Refactored trim_dll_part to explicitly check for and retain .NET namespace parts.
    • Improved the logic for stripping DLL prefixes from API names.
  • tests/test_helpers.py
    • Added test_is_aw_function to verify A/W suffix detection.
    • Added test_is_ordinal to confirm correct ordinal symbol identification.
    • Extended test_generate_symbols to include tests for .drv/.so stripping, uppercase DLLs, and W-suffixed APIs.
    • Added test_trim_dll_part to validate DLL part trimming, including ordinal and .NET cases.
    • Added test_reformat_forwarded_export_name to ensure proper formatting of forwarded export names.

Activity

  • All 7 new and extended tests passed successfully.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a great improvement. It adds comprehensive tests for several helper functions and refactors generate_symbols and trim_dll_part for better clarity and performance. The new tests significantly increase coverage and confidence in the API extraction and symbol generation logic. I have one suggestion to improve the maintainability of the new tests by using pytest's parameterization feature.

Comment on lines +34 to +49
def test_is_aw_function():
    # A-suffixed function
    assert helpers.is_aw_function("CreateFileA") is True
    # W-suffixed function
    assert helpers.is_aw_function("CreateFileW") is True
    # longer name ending with W
    assert helpers.is_aw_function("LoadLibraryExW") is True

    # does not end with A or W
    assert helpers.is_aw_function("WriteFile") is False
    assert helpers.is_aw_function("recv") is False

    # too short (length < 2)
    assert helpers.is_aw_function("A") is False
    assert helpers.is_aw_function("W") is False
    assert helpers.is_aw_function("") is False

medium

The new test functions like test_is_aw_function, test_is_ordinal, test_trim_dll_part, and test_reformat_forwarded_export_name are great additions. To improve their maintainability and reduce code duplication, consider using pytest.mark.parametrize. This would require adding import pytest to the file.

For example, test_is_aw_function could be refactored like this:

@pytest.mark.parametrize(
    "symbol, expected",
    [
        # A-suffixed function
        ("CreateFileA", True),
        # W-suffixed function
        ("CreateFileW", True),
        # longer name ending with W
        ("LoadLibraryExW", True),
        # does not end with A or W
        ("WriteFile", False),
        ("recv", False),
        # too short (length < 2)
        ("A", False),
        ("W", False),
        ("", False),
    ],
)
def test_is_aw_function(symbol, expected):
    assert helpers.is_aw_function(symbol) is expected

This approach makes it easier to add new test cases in the future. A similar pattern can be applied to the other new test functions, which would make the test suite more concise and easier to maintain.
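
As one illustration of carrying the same pattern over, a parametrized test_is_ordinal might look like the sketch below. The cases are hypothetical and are not the five assertions from this PR. A local `is_ordinal` stand-in (assumed rule: ordinals start with `#`) replaces `helpers.is_ordinal` so the snippet runs on its own; in tests/test_helpers.py the real helper would be called instead.

```python
import pytest


def is_ordinal(symbol: str) -> bool:
    # Stand-in for helpers.is_ordinal so the example is self-contained.
    # Assumption: an ordinal symbol starts with '#'.
    return symbol.startswith("#")


@pytest.mark.parametrize(
    "symbol, expected",
    [
        # pytest.param lets each case carry a readable id in the test report
        pytest.param("#1", True, id="ordinal"),
        pytest.param("#123", True, id="multi-digit-ordinal"),
        pytest.param("CreateFileA", False, id="named-import"),
        pytest.param("", False, id="empty-string"),
    ],
)
def test_is_ordinal(symbol, expected):
    assert is_ordinal(symbol) is expected
```

Giving each case an id means a failure is reported against a named case rather than a positional index.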

Development

Successfully merging this pull request may close these issues.

Add tests (and refactor) API extraction and symbol generation
