
loader: fix recycled TID overwriting matched calls in dynamic layout#2882

Open
devs6186 wants to merge 2 commits into mandiant:master from devs6186:fix/2619-recycled-tid-dynamic-layout

Conversation

@devs6186
Contributor

closes #2619

Root cause

When an OS recycles a thread ID (TID) within the same process, compute_dynamic_layout() iterated the same ThreadAddress twice. On the second encounter it executed:

calls_by_thread[t.address] = []   # overwrites the first instance's calls

The matched calls from the first thread instance were still present in matched_calls (added by the rule engine), but had been erased from the layout, causing _get_call_name() to raise:

ValueError: ('name not found for call', ...)

Additionally, if both the first and the second instance had matched calls, t.address was appended to threads_by_process[p.address] twice, creating a duplicate thread entry in the layout.
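The overwrite can be reproduced with a plain dict, independent of capa (the string "T1" stands in for a ThreadAddress that the OS reuses; the call names are illustrative):

```python
# Minimal illustration of the bug: iterating the same (recycled) TID twice.
calls_by_thread = {}
for tid, calls in [("T1", ["CreateFileW"]), ("T1", [])]:
    calls_by_thread[tid] = []          # second encounter erases the first
    calls_by_thread[tid].extend(calls)

print(calls_by_thread)  # {'T1': []} -- the matched call is gone
```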

Fix

Two targeted changes in compute_dynamic_layout() (capa/loader.py):

  1. Replace the overwriting assignment with setdefault() so calls from all instances of a recycled TID accumulate under the same key:

    # before
    calls_by_thread[t.address] = []
    # after
    calls_by_thread.setdefault(t.address, [])
  2. Guard the thread-registration block with a membership check so the thread address is only added to matched_threads / threads_by_process once, regardless of how many times the TID is recycled:

    # before
    if calls_by_thread[t.address]:
        matched_threads.add(t.address)
        threads_by_process[p.address].append(t.address)
    # after
    if calls_by_thread[t.address] and t.address not in matched_threads:
        matched_threads.add(t.address)
        threads_by_process[p.address].append(t.address)

This is the source-level fix requested by the maintainer: the data is now consistent before it reaches the rendering stage, so no exception handling in the renderer is needed.

Test

tests/test_loader.py::test_compute_dynamic_layout_recycled_tid constructs a minimal mock DynamicFeatureExtractor that yields the same ThreadAddress twice (simulating TID recycling), with a matched call only in the first instance. It asserts that compute_dynamic_layout() returns a layout containing exactly one thread entry with the matched call intact.

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed
  • This submission includes AI-generated code and I have provided details in the description.

@gemini-code-assist
Contributor

Summary of Changes

Hello @devs6186, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the dynamic analysis loader where the recycling of thread IDs by the operating system could lead to data loss and runtime errors. Specifically, it prevents the overwriting of matched calls associated with a recycled TID and ensures that such TIDs are not registered multiple times in the layout. These changes enhance the robustness and accuracy of dynamic analysis results, particularly in environments where TIDs are frequently reused. Additionally, the pull request includes a new test to cover this specific scenario and updates the documentation to provide a clearer overview of capa's output consumption methods.

Highlights

  • Dynamic Layout Bug Fix: Resolved an issue where recycled Thread IDs (TIDs) could overwrite previously matched calls in the dynamic layout, leading to ValueError during rendering. The fix ensures that calls from all instances of a recycled TID are accumulated correctly.
  • Duplicate Thread Entry Prevention: Implemented a check to prevent duplicate thread entries in the dynamic layout when an OS recycles a TID. This ensures each unique thread address is registered only once, maintaining data consistency.
  • New Test Case: Added a dedicated test case (test_compute_dynamic_layout_recycled_tid) to validate the fix for recycled TIDs, simulating the scenario and asserting correct behavior.
  • Documentation Update: Updated the usage documentation to include a table comparing various methods for consuming capa output, such as CLI, IDA Pro, Ghidra, Binary Ninja, dynamic sandboxes, and the web UI.


Changelog
  • CHANGELOG.md
    • Added bug fix entry for recycled TID issue.
    • Added documentation entry for capa output consumption table.
  • capa/loader.py
    • Fixed recycled TID overwriting matched calls in dynamic layout.
    • Prevented duplicate thread entries for recycled TIDs.
    • Reformatted several function calls and import statements for improved readability.
  • doc/usage.md
    • Added a table outlining different ways to consume capa output.
  • tests/test_loader.py
    • Added a new test case to verify the fix for recycled TIDs in dynamic layout.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request addresses a bug where recycled thread IDs in dynamic analysis could lead to overwritten data and subsequent errors. The fix correctly uses setdefault to accumulate calls from all instances of a recycled TID and adds a check to prevent duplicate thread registration. A comprehensive test case has been added to validate the fix. The overall implementation is correct and of high quality. The PR also includes significant code formatting changes, which improve readability. For future PRs, it would be beneficial to separate such large-scale formatting changes from functional bug fixes to make reviewing easier. The documentation updates are also a welcome addition.

@devs6186 devs6186 force-pushed the fix/2619-recycled-tid-dynamic-layout branch from 72dbf97 to 29bb083 Compare February 24, 2026 20:24
Collaborator

@mike-hunhoff mike-hunhoff left a comment


Hi @devs6186, thanks for looking into the data overwriting issue. While this fix prevents matched calls from being lost when a TID is recycled, it doesn't quite address the requirement for unique tracking of distinct thread and process lifecycles. By accumulating all calls under the same ThreadAddress, the layout still fuses separate executions into a single entry, which doesn't solve the core problem described in #2361.

Because we want to avoid merging partial solutions for this component, we'd like to see a more comprehensive fix that addresses the uniqueness and de-duplication problem entirely for both thread and process lifecycles (for example, by incorporating sequence IDs or timestamps into the Address classes).

If you're interested in fully addressing these requirements, we'd love to see those updates in this PR. Alternatively, if you don't have the time or interest to take on the broader scope right now, let us know and we can close this for now.

Collaborator


This file contains many format-based changes. Please revert these if they are not necessary to pass linting checks.

@devs6186
Contributor Author


I appreciate your review. I will look into it more broadly; I want to solve the problem completely instead of just submitting a PR. Give me a day or two at most, and I'll come back with a better solution. Thank you!

@devs6186 devs6186 force-pushed the fix/2619-recycled-tid-dynamic-layout branch 2 times, most recently from 944f89c to 90ba1b0 Compare March 2, 2026 14:23
…D lifecycles

Adds an optional `id` field to `ProcessAddress` and `ThreadAddress` that
sandbox backends can populate with a sandbox-specific unique identifier
(e.g. VMRay monitor_id, or a sequential counter for CAPE). When set, this
field becomes part of equality/hashing so that two process or thread
instances that share the same OS-assigned PID/TID are treated as distinct
addresses throughout capa's pipeline.

This comprehensively fixes the ValueError crash in render (mandiant#2619) by solving
the root uniqueness problem described in mandiant#2361: rather than merging recycled
lifecycles into a single entry, each instance now gets its own identity.

Changes:
- address.py: add optional `id` to ProcessAddress and ThreadAddress; update
  __eq__, __hash__, __lt__, __repr__ accordingly; backward-compatible (id=None
  by default)
- freeze/__init__.py: extend from_capa/to_capa to encode/decode the new id
  fields using extended tuple lengths; old 2/3/4-element tuples still decoded
  correctly for backward compatibility
- vmray/extractor.py: pass monitor_id as id to both ProcessAddress and
  ThreadAddress so each VMRay monitor instance is uniquely tracked
- cape/file.py: detect PID reuse via two-pass counting and assign sequential
  ids; processes with unique PIDs keep id=None (no behavior change)
- render/verbose.py: add _format_process_fields / _format_thread_fields helpers
  that include the id in rendered output when present
- tests/test_address_uniqueness.py: 35 unit tests covering identity, hashing,
  sorting, freeze roundtrip (incl. backward compat), and compute_dynamic_layout
  behavior for both recycled TIDs and recycled PIDs
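The identity change described in this commit can be sketched with a frozen dataclass (field names assumed for illustration; capa's real ThreadAddress differs in detail, e.g. it carries a full ProcessAddress rather than a bare pid):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ThreadAddress:
    # a frozen dataclass derives __eq__ and __hash__ from all fields,
    # so two instances sharing a TID but differing in id compare unequal
    pid: int
    tid: int
    id: Optional[int] = None  # sandbox-specific instance id; None = unset

a = ThreadAddress(pid=100, tid=5, id=0)
b = ThreadAddress(pid=100, tid=5, id=1)  # recycled TID, new lifecycle

print(a == b)       # False: distinct identities
print(len({a, b}))  # 2: both survive as dict/set keys
```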
@devs6186 devs6186 force-pushed the fix/2619-recycled-tid-dynamic-layout branch from 90ba1b0 to e4aa79a Compare March 2, 2026 14:25
@devs6186 devs6186 marked this pull request as ready for review March 2, 2026 14:26
Collaborator

@mike-hunhoff mike-hunhoff left a comment


thanks @devs6186. I started a pass and quickly realized that there appears to be a lot of code added to maintain backwards compatibility. This change is going to be considered a breaking change requiring a major release, so let's not overcomplicate the code to maintain backwards compatibility. Also, please ensure that process and thread uniqueness is addressed for all supported sandboxes. Presently, it looks like you only have process and thread covered for VMRay and process covered for CAPE.

Comment on lines +49 to +50
    # only assign ids when reuse is detected; otherwise keep id=None
    # for backward compatibility with existing addresses and freeze files
Collaborator


Please explain why the value must be None to maintain backwards compatibility? Isn't the id field being optional enough to maintain backwards compatibility. Otherwise, we should be assigning IDs moving forward.

Comment on lines +39 to +47
for process in report.behavior.processes:
addr = ProcessAddress(pid=process.process_id, ppid=process.parent_id)
yield ProcessHandle(address=addr, inner=process)
key = (process.parent_id, process.process_id)
counts[key] = counts.get(key, 0) + 1

# check for pid and ppid reuse
if addr not in seen_processes:
seen_processes[addr] = [process]
else:
logger.warning(
"pid and ppid reuse detected between process %s and process%s: %s",
process,
"es" if len(seen_processes[addr]) > 1 else "",
seen_processes[addr],
# second pass: yield handles with sequential ids for reused pairs
seq: dict[tuple[int, int], int] = {}
for process in report.behavior.processes:
key = (process.parent_id, process.process_id)
seq[key] = seq.get(key, 0) + 1
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain why two passes are required? vs. assigning unique IDs from the start.

Collaborator


What about CAPE threads?

Comment on lines +94 to +97
    if a.id is not None:
        return cls(type=AddressType.PROCESS, value=(a.ppid, a.pid, a.id))
    else:
        return cls(type=AddressType.PROCESS, value=(a.ppid, a.pid))
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this to maintain backwards compatibility? Generally, this change will be considered a breaking change regardless, so let's not overcomplicate the code to maintain backwards compatibility.

- CAPE file.py: single pass with sequential IDs (no two-pass)
- CAPE process.py: add thread uniqueness with sequential IDs
- Drakvuf helpers.py: assign id=0 to process/thread addresses
- freeze: always include id in tuples, remove backwards-compat branching
- freeze: revert format-only changes in loads_static/loads_dynamic/main
- verbose.py: remove over-engineered pid-id display logic
- tests: update for simplified API, remove backwards-compat tests
@devs6186
Contributor Author

thanks @mike-hunhoff, really appreciate the detailed review. pushed the changes in 2585abf addressing everything:

dropped all the backwards-compat complexity:
freeze format now always uses the full tuple with the id field included. removed all the conditional len() checks and the "if not zero then None" logic. since this is a breaking change anyway, the code is much simpler now.

CAPE processes - single pass:
replaced the two-pass count-then-assign approach with a straightforward single pass. every process gets a sequential id starting from 0, no special-casing.

CAPE threads:
added thread uniqueness handling in cape/process.py. each thread now gets a sequential id to handle recycled TIDs within the same process.

Drakvuf:
updated drakvuf/helpers.py to pass id=0 for all process and thread addresses so the id field is populated across all supported sandboxes.

reverted format-only changes:
reverted all the unnecessary line wrapping in freeze/__init__.py (loads_static, loads_dynamic, main) and the DynamicCallAddress.__eq__ reformatting in address.py.

sandbox coverage now:

  • VMRay: process + thread (uses monitor_id)
  • CAPE: process + thread (sequential ids)
  • Drakvuf: process + thread (id=0)
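The single-pass sequential-id scheme for CAPE can be sketched as follows (hypothetical helper; CAPE's actual report model and capa's extractor code differ):

```python
from collections import defaultdict

def assign_ids(pids):
    """Give each occurrence of a pid a sequential instance id, in one pass.

    The first time a pid appears it gets id 0; each reuse increments the
    counter, so recycled pids become distinct (pid, id) identities.
    """
    seq = defaultdict(int)
    out = []
    for pid in pids:
        out.append((pid, seq[pid]))
        seq[pid] += 1
    return out

print(assign_ids([100, 200, 100]))  # [(100, 0), (200, 0), (100, 1)]
```

No second counting pass is needed: because every process gets an id unconditionally, the counter alone distinguishes reused pids.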

@devs6186
Contributor Author

Hi @mike-hunhoff

I hope you're doing well, I know this is a busy period and I genuinely appreciate the time you've already put into the project.

I wanted to send a brief follow-up on Discussion #99. The GSoC application deadline is approaching and I want to make sure I'm heading in the right direction before I finalize my proposal. I'm not asking for an exhaustive review; even a quick "this looks promising / you're off-track on X" would be enormously helpful at this stage.

Thank you

@mike-hunhoff
Collaborator


@devs6186 please keep all GSoC-related correspondence in the flare-gsoc discussions so we can better track them.
