affinity: bind worker memory to local NUMA node #15352

Open
ssam18 wants to merge 1 commit into OISF:main from ssam18:numa-memory-affinity-v2

@ssam18

@ssam18 ssam18 commented May 10, 2026

Make sure these boxes are checked accordingly before submitting your Pull Request.

Contribution style:

Our Contribution agreements:

Changes (if applicable):

Link to ticket: https://redmine.openinfosecfoundation.org/issues/8141

This is the v2 iteration of #15350 incorporating the maintainer review request to wrap the commit body at the project line width and to add the ticket reference inside the commit message.

Suricata already pins worker threads to NUMA-local CPUs through the existing cpu-affinity logic, but the heap allocations those workers make still go through plain malloc and land on whatever NUMA node the kernel picks at first touch. On dual-socket DPDK boxes this lines up cleanly only when every worker's first allocation happens after pinning, which is not always the case for shared structures created by the main thread before fan-out, so cross-NUMA traffic shows up as packet loss on the worker side. This change adds a new opt-in threading.set-memory-affinity setting that calls numa_set_localalloc on each worker right after the CPU pin, so every subsequent allocation on that thread prefers the local node and falls back to other nodes only if the local one is full. The option is gated on libnuma at build time and is a no-op on single-node systems, so default behavior is unchanged.
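
As a rough illustration of the opt-in setting described above, enabling it in suricata.yaml might look like the fragment below. The key name comes from this PR; its placement next to set-cpu-affinity and the `yes` value are assumptions based on how other threading options are usually written, not a copy of the patch.

```yaml
# Sketch only: assumes the new key sits directly under "threading".
threading:
  set-cpu-affinity: yes       # existing option; workers are pinned to CPUs first
  set-memory-affinity: yes    # new opt-in option from this PR; defaults to false
```

With the option left out or set to `no`, behavior should stay as it is today.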

Replaces #15350.

Describe changes:

  • Add a new threading.set-memory-affinity boolean option in runmodes that defaults to false and logs a warning when it is requested without libnuma support.
  • In tm-threads, after CPU pinning in TmThreadSetupOptions, call a small helper that runs numa_set_localalloc on the worker thread when libnuma is available and more than one NUMA node is online.
  • Probe libnuma in configure.ac so HAVE_LIBNUMA is also defined for non-DPDK builds and the new option works there too.
  • Document the new YAML setting in suricata.yaml.in and in the user guide.
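
The helper described in the list above could be sketched roughly as follows. This is not the code from the patch: the function name WorkerBindMemoryLocal and the MemAffinityResult enum are hypothetical, and numa_num_configured_nodes() is only one possible way to check for more than one node. The libnuma calls are compiled in only when HAVE_LIBNUMA is defined, mirroring the build-time gating the PR describes.

```c
#include <stdbool.h>

#ifdef HAVE_LIBNUMA
#include <numa.h>
#endif

/* Hypothetical result codes for this sketch; not names from the patch. */
enum MemAffinityResult {
    MEM_AFFINITY_DISABLED,    /* option not enabled in the config */
    MEM_AFFINITY_UNSUPPORTED, /* built without libnuma, or no NUMA support at runtime */
    MEM_AFFINITY_SINGLE_NODE, /* only one node configured, nothing to do */
    MEM_AFFINITY_APPLIED      /* local allocation policy set for this thread */
};

/* Intended to be called on the worker thread itself, after CPU pinning,
 * because numa_set_localalloc() changes the policy of the calling thread. */
static enum MemAffinityResult WorkerBindMemoryLocal(bool enabled)
{
    if (!enabled)
        return MEM_AFFINITY_DISABLED;
#ifdef HAVE_LIBNUMA
    /* numa_available() returns a negative value when the NUMA API
     * cannot be used on this system. */
    if (numa_available() < 0)
        return MEM_AFFINITY_UNSUPPORTED;
    /* Assumption: treat fewer than two configured nodes as single-node. */
    if (numa_num_configured_nodes() < 2)
        return MEM_AFFINITY_SINGLE_NODE;
    /* Prefer the node the thread is running on for later allocations. */
    numa_set_localalloc();
    return MEM_AFFINITY_APPLIED;
#else
    return MEM_AFFINITY_UNSUPPORTED;
#endif
}
```

A caller would invoke WorkerBindMemoryLocal(true) from the worker once its CPU affinity is set and could log a warning on MEM_AFFINITY_UNSUPPORTED. Returning a status instead of logging inside the helper keeps the sketch independent of Suricata's logging macros.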

Provide values to any of the below to override the defaults.

  • To use a Suricata-Verify or Suricata-Update pull request,
    link to the pull request in the respective _BRANCH variable.
  • Leave unused overrides blank or remove.

SV_REPO=
SV_BRANCH=
SU_REPO=
SU_BRANCH=

Suricata already pins worker threads to NUMA-local CPUs via the
existing cpu-affinity logic, but every heap allocation those workers
make still goes through plain malloc and lands on whatever NUMA node
the kernel picked for the first touch. When workers and their flow
records end up on different nodes the cross-NUMA traffic shows up as
packet loss on multi-socket boxes.

Add a new threading.set-memory-affinity option, opt-in, that calls
numa_set_localalloc() on each worker right after CPU pinning so all
subsequent allocations on that thread prefer the local node.
Requires libnuma at build time and is a no-op on single-node
systems.

Bug: OISF#8141.
@lukashino
Contributor

Hi Samaresh,
I've looked into the change of https://redmine.openinfosecfoundation.org/issues/8141 to a review state. I see you are not assigned to the ticket itself. Are you registered on Redmine, and if so, what is your username? We need that so we can assign you a developer role.

@github-actions

NOTE: This PR may contain new authors.

@codecov

codecov Bot commented May 11, 2026

Codecov Report

❌ Patch coverage is 43.33333% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.63%. Comparing base (8968b1c) to head (3c639c3).
⚠️ Report is 13 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15352      +/-   ##
==========================================
- Coverage   82.66%   82.63%   -0.04%     
==========================================
  Files         995      996       +1     
  Lines      271046   271141      +95     
==========================================
- Hits       224069   224060       -9     
- Misses      46977    47081     +104     
Flag              Coverage Δ
fuzzcorpus        61.02% <22.72%> (-0.04%) ⬇️
livemode          18.41% <45.45%> (+0.03%) ⬆️
netns             22.57% <22.72%> (-0.06%) ⬇️
pcap              45.18% <22.72%> (-0.05%) ⬇️
suricata-verify   66.37% <47.36%> (-0.03%) ⬇️
unittests         58.55% <22.72%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

@lukashino
Contributor

To smooth out the rest of the process, I am also attaching the contribution process guide:
https://docs.suricata.io/en/latest/devguide/contributing/contribution-process.html

The contribution agreement (CLA) can be signed at:
https://suricata.io/contribution-agreements/

@ssam18
Author

ssam18 commented May 11, 2026

@lukashino I do have a Redmine account already, my username there is ssam18. Please assign the ticket to me when you get a chance, thanks.

@ssam18
Author

ssam18 commented May 11, 2026

> Hi Samaresh, I've looked into the change of https://redmine.openinfosecfoundation.org/issues/8141 to a review state. I see you are not assigned to the ticket itself. Are you registered on Redmine, and if so, what is your username? We need that so we can assign you a developer role.

Yes, it is ssam18.

@ssam18
Author

ssam18 commented May 11, 2026

Thanks for the links @lukashino. I have already signed the OISF Contribution Agreement and I have gone through the contribution process page, so I am set on that side. Ready to keep iterating on the PR whenever you have time to review.
