Skip to content

loader: handle missing basic blocks in compute_static_layout#2888

Open
reyyanxahmed wants to merge 4 commits intomandiant:masterfrom
reyyanxahmed:fix/compute-static-layout-missing-bb-2734
Open

loader: handle missing basic blocks in compute_static_layout#2888
reyyanxahmed wants to merge 4 commits intomandiant:masterfrom
reyyanxahmed:fix/compute-static-layout-missing-bb-2734

Conversation

@reyyanxahmed
Copy link
Copy Markdown

Problem

compute_static_layout crashes with an AssertionError when using the Binary Ninja extractor:

AssertionError: assert addr in functions_by_bb

Binary Ninja may recompute its IL between the feature-extraction pass and the layout-computation pass, causing basic block boundaries to shift. A basic block address that existed when find_capabilities() discovered a match may no longer be present when compute_static_layout() re-enumerates basic blocks via extractor.get_basic_blocks().

Fix

Replace the hard assert addr in functions_by_bb with a conditional check that:

  1. Logs a warning identifying the orphaned address.
  2. Skips the missing entry with continue, preserving all other valid layout data.

This is consistent with the defensive approach taken in #2523 for the closely related issue #2516.

Testing

Added tests/test_layout.py with a unit test that reproduces the scenario using NullStaticFeatureExtractor:

  1. Creates an extractor with one function containing one basic block.
  2. Runs find_capabilities() to match a BB-scoped rule.
  3. Removes the basic block from the extractor to simulate Binary Ninja's state change.
  4. Calls compute_static_layout() and verifies it completes without error.

Fixes #2734.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the robustness of the capa loader by addressing a specific issue where compute_static_layout would crash if basic block addresses, identified during an initial feature extraction pass, became unavailable during a subsequent layout computation. This situation can arise with tools like Binary Ninja due to intermediate language recomputation. The changes implement a defensive mechanism to gracefully handle such discrepancies, logging warnings instead of asserting, and include a dedicated unit test to validate this behavior.

Highlights

  • Robustness in compute_static_layout: Modified compute_static_layout to prevent crashes when basic blocks, previously matched during feature extraction, are no longer present during layout computation, specifically addressing issues with Binary Ninja's dynamic IL recomputation.
  • Error Handling: Replaced a hard AssertionError with a conditional check that logs a warning and skips orphaned basic block addresses, ensuring the process continues gracefully.
  • New Unit Test: Added test_layout.py with a unit test that simulates the scenario of a missing basic block to verify the fix and prevent regressions.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • capa/loader.py
  • tests/test_layout.py
    • Added a new test file.
    • Implemented test_compute_static_layout_with_missing_basic_block to simulate a basic block disappearing between feature extraction and layout computation.
    • Verified that compute_static_layout completes without error and produces an empty layout in this scenario.
Activity
  • No specific activity (comments, reviews, progress updates) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

…t#2734)

Extractors such as Binary Ninja may discover basic blocks during
feature extraction that are no longer present when re-enumerating
basic blocks during layout computation, because the IL can be
recomputed between the two passes, shifting basic block boundaries.

Replace the hard assertion with a defensive check that logs a warning
and skips the missing address. This prevents the AssertionError crash
while preserving all valid layout data.

Add a unit test that reproduces the scenario with
NullStaticFeatureExtractor by removing a basic block between
find_capabilities() and compute_static_layout().

Fixes mandiant#2734.
@reyyanxahmed reyyanxahmed force-pushed the fix/compute-static-layout-missing-bb-2734 branch from b3f504b to e9b1e71 Compare March 2, 2026 13:57
@github-actions github-actions bot dismissed their stale review March 2, 2026 13:58

CHANGELOG updated or no update needed, thanks! 😄

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request resolves a crash in compute_static_layout by replacing a strict assertion with a more robust check that logs a warning and continues. Specifically, it addresses an AssertionError in compute_static_layout in capa/loader.py when using the Binary Ninja extractor, improving the robustness of the tool. A new unit test in tests/test_layout.py has been added to reproduce and verify the fix. My security audit confirmed that no new security vulnerabilities were introduced. I've also provided minor suggestions for code clarity and to remove unused code in the new test file.

Comment on lines +21 to +24
import capa.features.basicblock
import capa.features.extractors.null
from capa.features.address import AbsoluteVirtualAddress
from capa.features.extractors.base_extractor import BBHandle, SampleHashes, FunctionHandle
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There are a few unused imports here. capa.features.basicblock, BBHandle, and FunctionHandle are imported but not used. It's good practice to remove them to keep the code clean.

Suggested change
import capa.features.basicblock
import capa.features.extractors.null
from capa.features.address import AbsoluteVirtualAddress
from capa.features.extractors.base_extractor import BBHandle, SampleHashes, FunctionHandle
import capa.features.extractors.null
from capa.features.address import AbsoluteVirtualAddress
from capa.features.extractors.base_extractor import SampleHashes


See #2734.
"""
# Create an extractor with two basic blocks at 0x401000 and 0x401010
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This comment is slightly inaccurate. It states that two basic blocks are created, but the code below only creates one.

Suggested change
# Create an extractor with two basic blocks at 0x401000 and 0x401010
# Create an extractor with one basic block at 0x401000

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Both suggestions are addressed in commit aadf32e:

Unused imports removed dropped [capa.features.basicblock], BBHandle, and FunctionHandle as suggested.
Comment fixed: changed "two basic blocks" to "one basic block at 0x401000" to accurately reflect the test setup. /gemini-code-assist can you review?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the feedback. I've reviewed the changes in commit aadf32e and can confirm that:

  • The comment in tests/test_layout.py on line 37 has been updated to accurately reflect the creation of "one basic block at 0x401000".
  • The unused imports capa.features.basicblock, BBHandle, and FunctionHandle have been successfully removed from tests/test_layout.py.

These changes resolve the previously identified issues. The new test test_compute_static_layout_with_missing_basic_block also appears to be well-structured and correctly validates the fix for the described scenario.

@reyyanxahmed
Copy link
Copy Markdown
Author

@mike-hunhoff this PR is ready for review. It fixes the Binary Ninja extractor crash in #2734 by replacing the hard assert with a defensive check, consistent with the approach in #2523. CI is green and bot feedback has been addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression: crash in compute_static_layout when using binja extractor

1 participant