affinity: bind worker memory to local NUMA node #15352
Suricata already pins worker threads to NUMA-local CPUs via the existing cpu-affinity logic, but every heap allocation those workers make still goes through plain malloc and lands on whatever NUMA node the kernel picks at first touch. When workers and their flow records end up on different nodes, the cross-NUMA traffic shows up as packet loss on multi-socket boxes. Add a new opt-in threading.set-memory-affinity option that calls numa_set_localalloc() on each worker right after CPU pinning, so all subsequent allocations on that thread prefer the local node. The option requires libnuma at build time and is a no-op on single-node systems. Bug: OISF#8141.
Hi Samaresh,
NOTE: This PR may contain new authors.
Codecov Report
❌ Patch coverage is
@@            Coverage Diff             @@
##             main   #15352      +/-   ##
==========================================
- Coverage   82.66%   82.63%   -0.04%
==========================================
  Files         995      996       +1
  Lines      271046   271141      +95
==========================================
- Hits       224069   224060       -9
- Misses      46977    47081     +104
To smooth out the rest of the process, I am also attaching: Information on signing the CLA can be found at:
@lukashino I already have a Redmine account; my username there is ssam18. Please assign the ticket to me when you get a chance, thanks.
Yes, it is ssam18.
Thanks for the links @lukashino. I have already signed the OISF Contribution Agreement and I have gone through the contribution process page, so I am set on that side. Ready to keep iterating on the PR whenever you have time to review.
Make sure these boxes are checked accordingly before submitting your Pull Request.
Contribution style:
https://docs.suricata.io/en/latest/devguide/contributing/contribution-process.html
Our Contribution agreements:
https://suricata.io/about/contribution-agreement/ (note: this is only required once)
Changes (if applicable):
(including schema descriptions)
https://redmine.openinfosecfoundation.org/projects/suricata/issues
Link to ticket: https://redmine.openinfosecfoundation.org/issues/8141
This is the v2 iteration of #15350. It incorporates the maintainer review requests to wrap the commit body at the project line width and to add the ticket reference inside the commit message.
Suricata already pins worker threads to NUMA-local CPUs through the existing cpu-affinity logic, but the heap allocations those workers make still go through plain malloc and land on whatever NUMA node the kernel happens to pick at first touch. On dual-socket DPDK boxes, placement lines up cleanly only when every worker's first allocation happens after pinning, which is not always the case for shared structures created by the main thread before fan-out, so cross-NUMA traffic shows up as packet loss on the worker side. This change adds a new opt-in threading.set-memory-affinity setting that calls numa_set_localalloc() on each worker right after the CPU pin, so every subsequent allocation on that thread prefers the local node and falls back to other nodes only if the local one is full. The option is gated on libnuma at build time and is a no-op on single-node systems, so default behavior is unchanged.
Replaces #15350.
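For reviewers who want the call order at a glance, here is a minimal sketch of the sequence the description outlines: pin the calling thread, then switch its allocation policy to local. It is not the code in this PR. The `HAVE_LIBNUMA` macro and the helper names `PinCallingThreadToCpu` and `BindWorkerMemoryLocal` are hypothetical and chosen only for illustration; `sched_setaffinity`, `numa_available`, `numa_max_node` and `numa_set_localalloc` are the real glibc and libnuma calls.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>

#ifdef HAVE_LIBNUMA /* hypothetical build-time macro, set when libnuma is found */
#include <numa.h>
#endif

/* Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on an invalid CPU index or a kernel error. */
static int PinCallingThreadToCpu(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread" */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Hypothetical helper: make later allocations on the calling thread prefer
 * the node it is running on. Meant to be called right after pinning.
 * Returns true if the policy was applied, false if this was a no-op. */
static bool BindWorkerMemoryLocal(bool enabled)
{
    if (!enabled)
        return false; /* option is opt-in, default behavior unchanged */
#ifdef HAVE_LIBNUMA
    if (numa_available() < 0)
        return false; /* kernel or library has no NUMA support */
    if (numa_max_node() == 0)
        return false; /* single-node system: nothing to gain */
    numa_set_localalloc();
    return true;
#else
    return false; /* built without libnuma */
#endif
}
```

A worker would call `PinCallingThreadToCpu(cpu)` first and `BindWorkerMemoryLocal(enabled)` second, with `enabled` taken from the threading.set-memory-affinity setting. When `HAVE_LIBNUMA` is defined the program has to be linked with `-lnuma`. Memory the main thread touched before the workers started keeps its original placement, since the policy only affects later allocations on the calling thread.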
Describe changes:
Provide values to any of the below to override the defaults.
link to the pull request in the respective _BRANCH variable.
SV_REPO=
SV_BRANCH=
SU_REPO=
SU_BRANCH=