
Please pull 26.04 linux nvidia.glue#428

Open
fyu1 wants to merge 80 commits into NVIDIA:26.04_linux-nvidia from fyu1:26.04_linux-nvidia.glue

Conversation

@fyu1
Collaborator

@fyu1 fyu1 commented May 18, 2026

This MPAM PR has 4 parts:

1-47: backported from upstream
48: enable RESCTRL_FS
49-52: forward-ported from 6.17 hwe
53: fix issues on Grace

Please review and merge to 7.0 hwe.

@nirmoy
Collaborator

nirmoy commented May 18, 2026

Boro watcher review skipped

The GitHub watcher skips automatic boro reviews for PRs with more than 50 commits. This PR currently has 80 commits.

To run the review anyway, ask BaseOS_Kernel_Bot in #baseos-kernel:

review https://github.com/NVIDIA/NV-Kernels/pull/428

Head: 91f12a38b598

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher sees a newer PR head.

@github-actions
Contributor

github-actions Bot commented May 18, 2026

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ❌ Errors found

Details
Checking 80 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 91f12a38b598 │ [SAUCE] resctrl/mpam: reset ris by applying explicit default con │ N/A        │ N/A     │ sdonthin, fenghuay        │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2eda97025439 │ [SAUCE] arm_mpam: resctrl: add the glue code to convert to/from  │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b7160b35eb55 │ [SAUCE] fs/resctrl: add l2 and l3 'max' resource schema          │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 6f31ecd69ca9 │ [SAUCE] fs/resctrl: expose the schema format to user-space       │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 46209ad7affe │ [SAUCE] fs/resctrl: add fflags_from_schema() for files based on  │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ d050e66b25ea │ [SAUCE] fs/resctrl: add additional files for percentage and bitm │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ eb3b4e6dae6f │ [SAUCE] x86/resctrl: move over to specifying mba control formats │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b62159dde918 │ [SAUCE] fs/resctrl: add specific schema types for 'range'        │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 1d078a92e2e8 │ [SAUCE] fs/resctrl: add a schema format to the schema, allowing  │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 206338f606ea │ [SAUCE] fs/resctrl: rename resctrl_get_default_ctrl() to include │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 15a239d1d962 │ [SAUCE] fs/resctrl: move mba supported check to parse_line() ins │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ af41c8253d2b │ [SAUCE] fs/resctrl: abstract duplicate domain test to a helper   │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 7d1f6817689a │ [SAUCE] fs/resctrl: group all the mba specific properties in a s │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 78a0e4578c72 │ [SAUCE] arm_mpam: rename mbw conversion to 'fract16' for code re │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 230340c575ec │ [SAUCE] arm_mpam: allow cmax/cmin to be configured               │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ cf61206d0c6a │ [SAUCE] fs/restrl: allow the overflow handler to be disabled     │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2df1e9c80ca3 │ [SAUCE] arm_mpam: resctrl: determine if any exposed counter can  │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 380cb1ddcb31 │ [SAUCE] x86/resctrl: add stub to allow other architecture to dis │ N/A        │ N/A     │ morse, fenghuay           │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 1c1391cf8823 │ [SAUCE] fs/resctrl,x86/resctrl: factor mba rounding to be per-ar │ N/A        │ N/A     │ Martin, morse, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 53b574b7d628 │ [SAUCE] arm_mpam: add resctrl_arch_round_bw()                    │ N/A        │ N/A     │ Martin, morse, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 281733bead7d │ [SAUCE] arm64: mpam: add memory bandwidth usage (mbwu) documenta │ N/A        │ N/A     │ morse, horgan, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 10bbd9316dee │ [SAUCE] arm_mpam: resctrl: add resctrl_arch_cntr_read() & resctr │ N/A        │ N/A     │ morse, horgan, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ cbf84498806c │ [SAUCE] arm_mpam: resctrl: add resctrl_arch_config_cntr() for ab │ N/A        │ N/A     │ morse, horgan, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 20c17a1e015e │ [SAUCE] arm_mpam: resctrl: pre-allocate assignable monitors      │ N/A        │ N/A     │ morse, horgan, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b17e3d75e1f7 │ [SAUCE] arm_mpam: resctrl: pick classes for use as mbm counters  │ N/A        │ N/A     │ morse, horgan, fenghuay   │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 5a8df648c704 │ [SAUCE] fs/resctrl: document tasks file behaviour for task id 0  │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 989ae92452fb │ [SAUCE] fs/resctrl: document that automatic counter assignment i │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 0c0dbab2d628 │ [SAUCE] fs/resctrl: continue counter allocation after failure    │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 53df665dd64d │ [SAUCE] fs/resctrl: add monitor property 'mbm_cntr_assign_fixed' │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ aa2cb0973710 │ [SAUCE] fs/resctrl: disallow the software controller when mbm co │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 4fbfd0d9e773 │ [SAUCE] x86,fs/resctrl: create 'event_filter' files read only if │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 27aabdb582d5 │ [SAUCE] fs/resctrl: tidy up the error path in resctrl_mkdir_even │ N/A        │ N/A     │ horgan, bp, fenghuay      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bd34c004fc3f │ [SAUCE] update annotations to set config_resctrl_fs              │ N/A        │ N/A     │ fenghuay                  │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2b35978f2ac7 │ 4d5bbbafc170 arm_mpam: resctrl: Make resctrl_mon_ctx_waiters sta │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ e72b1c4ab7f8 │ 67c0a487efa5 arm_mpam: resctrl: Fix the check for no monitor com │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 269502319a3f │ f758340da529 arm_mpam: resctrl: Fix MBA CDP alloc_capable handli │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ c23956bf8503 │ 79727019ce3d fs/resctrl: Add missing return value descriptions   │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ f9dcb98affb7 │ c611752be9d7 MAINTAINERS: Update resctrl entry                   │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 613413511122 │ d2bf45d067c7 fs/resctrl: Add "*" shorthand to set io_alloc CBM f │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 29b58df7d8cc │ d06b8e7c97c3 fs/resctrl: Report invalid domain ID when parsing i │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 14eb35802431 │ 4ce0a2ccc035 arm64: mpam: Add initial MPAM documentation         │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ dac3a37207c9 │ aeb8595a5f8b arm_mpam: Quirk CMN-650's CSU NRDY behaviour        │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 6f22bc67499a │ dc48eb1ff27c arm_mpam: Add workaround for T241-MPAM-6            │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 7b8a4371c8b5 │ a7efe23ed6dd arm_mpam: Add workaround for T241-MPAM-4            │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ af9bb0c870bf │ 70e81fbedc65 arm_mpam: Add workaround for T241-MPAM-1            │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 9b8f90c62982 │ fa7745218c98 arm_mpam: Add quirk framework                       │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 358528ec02c9 │ fb481ec08699 arm_mpam: resctrl: Call resctrl_init() on platforms │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 06c8243ec46f │ 4aab135bda16 arm64: mpam: Select ARCH_HAS_CPU_RESCTRL            │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ deee6b76199d │ ec9a788620be ALSA: usb-audio: Replace hard-coded number with MAX │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 9206d543df4f │ efc775eadce2 arm_mpam: resctrl: Add empty definitions for assort │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bd2671915dee │ 49b04e401825 arm_mpam: resctrl: Update the rmid reallocation lim │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ d0cec11cb7d0 │ fb56b29932ca arm_mpam: resctrl: Add resctrl_arch_rmid_read()     │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 52865ff71811 │ 2a3c79c61539 arm_mpam: resctrl: Allow resctrl to allocate monito │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 18e97392420f │ 1458c4f05335 arm_mpam: resctrl: Add support for csu counters     │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bbffbdc60dac │ 264c285999fc arm_mpam: resctrl: Add monitor initialisation and d │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 0664e8fc3954 │ 5dc8f73eaa5d arm_mpam: resctrl: Add kunit test for control forma │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 7f2f29def984 │ 36528c7681b8 arm_mpam: resctrl: Add support for 'MB' resource    │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 434f7277e4f3 │ 1c1e2968a860 arm_mpam: resctrl: Wait for cacheinfo to be ready   │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 3a955712477c │ 3e9b35823aab arm_mpam: resctrl: Add rmid index helpers           │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ a97fb28ede04 │ 80d147d29313 arm_mpam: resctrl: Convert to/from MPAMs fixed-poin │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 3be86ae47886 │ 01a0021f6c39 arm_mpam: resctrl: Hide CDP emulation behind CONFIG │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 211daf5feb08 │ 6789fb99282c arm_mpam: resctrl: Add CDP emulation                │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 3cf3f97e9737 │ 9d2e1a99fae5 arm_mpam: resctrl: Add plumbing against arm64 task  │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 1b483bb42ae0 │ 9cd2b522be2c arm_mpam: resctrl: Implement helpers to update conf │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 0001207e00fb │ 02cc66168788 arm_mpam: resctrl: Add resctrl_arch_get_config()    │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2fa62062b48d │ 370d166d878d arm_mpam: resctrl: Implement resctrl_arch_reset_all │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2dcb68165d46 │ 52a4edb16121 arm_mpam: resctrl: Pick the caches we will use as r │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 15ec9547884a │ 09e61daf8e96 arm_mpam: resctrl: Add boilerplate cpuhp and domain │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 9d55c1f97a87 │ 2cf9ca3fae38 arm64: mpam: Add helpers to change a task or cpu's  │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b462dda4ef7d │ 37fe0f984d9c arm64: mpam: Initialise and context switch the MPAM │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 579a4634b812 │ 735dad999905 arm64: mpam: Add cpu_pm notifier to restore MPAM sy │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 43ae4efa6974 │ 831a7f16728c arm64: mpam: Advertise the CPUs MPAM limits to the  │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 766367a9ab86 │ c544f00a4732 arm64: mpam: Drop the CONFIG_EXPERT restriction     │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b9689091cb92 │ 87b78a5d70e8 arm64: mpam: Re-initialise MPAM regs when CPU comes │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 7493366afc67 │ 8e06d04ff1cf arm64: mpam: Context switch the MPAM registers      │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 5b364cf656e7 │ 2e7c684bdb50 KVM: arm64: Make MPAMSM_EL1 accesses UNDEF          │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ fca70552c9a0 │ eda1cd1f9d29 KVM: arm64: Preserve host MPAM configuration when c │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2eeb6cc7155b │ 29fa1be82b83 arm64/sysreg: Add MPAMSM_EL1 register               │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 12138ae2943d │ a1cb6577f575 arm_mpam: Reset when feature configuration bit unse │ match      │ match   │ preserved + fenghuay adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ ce3d6dad81e9 │ f91e913355f4 arm_mpam: Ensure in_reset_state is false after appl │ match      │ match   │ preserved + fenghuay adde │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint results:
W: 358528ec02c9 ("arm_mpam: resctrl: Call resctrl_init() on platform"): subject 76 chars (>72)
W: 9d55c1f97a87 ("arm64: mpam: Add helpers to change a task or cpu's"): subject 73 chars (>72)

PR metadata:
W: PR title missing [<branch>] prefix: "Please pull 26.04 linux nvidia.glue"
E: PR targets 26.04_linux-nvidia but body has no https://bugs.launchpad.net/... link

@fyu1 fyu1 changed the title from "26.04 linux nvidia.glue" to "Please pull 26.04 linux nvidia.glue" May 18, 2026
@nvmochs
Collaborator

nvmochs commented May 18, 2026

@fyu1

cac60bd NVIDIA: SAUCE: arm_mpam: Include all associated
9515328 NVIDIA: SAUCE: arm_mpam: resctrl: Pre-allocate assignable monitors
02743bd NVIDIA: SAUCE: arm_mpam: resctrl: Pre-allocate free running monitors
43f25ff NVIDIA: SAUCE: untested: arm_mpam: resctrl: pick classes for use as mbm counters

What function are these patches providing? Is it a Grace feature?

What are the upstream plans for these patches? (It looks like they were part of the MPAM Part 2 series at one point but were dropped?)


I verified the 47 patches from upstream were clean picks. No issues with those or with the annotations patch.


cac60bd NVIDIA: SAUCE: arm_mpam: Include all associated

Codex found 2 issues with this patch...

cac60bd drops a source hunk from ac1e5be in drivers/resctrl/mpam_devices.c.

The source patch changes mpam_ris_get_affinity() so memory-class components with empty affinity, or memory classes above level 3, are associated with cpu_possible_mask. The target commit does not carry that over. Current code still just does:

drivers/resctrl/mpam_devices.c:505

  case MPAM_CLASS_MEMORY:
          get_cpumask_from_node_id(comp->comp_id, affinity);
          /* affinity may be empty for CPU-less memory nodes */
          break;

The source has:

  if (cpumask_empty(affinity)) {
          dev_warn_once(..., "CPU-less numa node");
          cpumask_copy(affinity, cpu_possible_mask);
  } else if (class->level > 3)
          cpumask_copy(affinity, cpu_possible_mask);

That matters because cac60bd changes CPU online/offline handling to iterate all components whose affinity contains the CPU. Without the affinity hunk, CPU-less memory nodes stay empty, and level >3 memory components stay tied to their NUMA node mask instead of all CPUs. That undermines the “include all associated” behavior for those components.
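
To make the effect of the omitted hunk concrete, here is a minimal user-space sketch of the affinity decision for a memory-class component. It is not the kernel code: the 64-bit mask type, every name ending in _model, and the three helper function names are assumptions made for illustration. Only the branch structure follows the hunk quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_model_t;

/*
 * Affinity of a memory-class component with the hunk applied.
 * node_mask models the mask derived from the NUMA node and
 * possible_mask models cpu_possible_mask.
 */
static cpumask_model_t memory_affinity_with_hunk(cpumask_model_t node_mask,
                                                 int level,
                                                 cpumask_model_t possible_mask)
{
    if (node_mask == 0)     /* CPU-less NUMA node */
        return possible_mask;
    if (level > 3)          /* memory class above level 3 */
        return possible_mask;
    return node_mask;
}

/* Affinity without the hunk: the node mask is used unchanged, even if empty. */
static cpumask_model_t memory_affinity_without_hunk(cpumask_model_t node_mask)
{
    return node_mask;
}

/* Would a walk over "components whose affinity contains cpu" visit this one? */
static bool affinity_contains_cpu(cpumask_model_t affinity, unsigned int cpu)
{
    return cpu < 64 && ((affinity >> cpu) & 1u) != 0;
}
```

In this model, an empty node mask yields an empty affinity when the hunk is missing, so a per-CPU walk never visits that component. With the hunk, the same component is visited for every possible CPU.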

The [fenghuay:] note should be improved. It currently does not mention omitting the mpam_ris_get_affinity() hunk. I think this is not just an annotation problem; the hunk should likely be added unless there is a deliberate branch-specific reason to omit it.


cac60bd duplicates for_each_mpam_resctrl_control(). The exact same macro is defined twice in drivers/resctrl/mpam_resctrl.c. That is a cleanup/build-hygiene issue, and the commit note should not say “adds” that macro if it was already present.

Comment thread drivers/resctrl/mpam_devices.c
@nirmoy
Collaborator

nirmoy commented May 19, 2026

Boro review

Latest watcher review: open review

Head: f3404d4f7d2d

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

@nirmoy nirmoy added the help wanted and question labels May 21, 2026
@nvidia-bfigg nvidia-bfigg force-pushed the 26.04_linux-nvidia branch 2 times, most recently from bbda548 to 837b23f May 22, 2026 19:36
@fyu1 fyu1 force-pushed the 26.04_linux-nvidia.glue branch 2 times, most recently from c9f54eb to 3889fc6 May 26, 2026 02:27
@nirmoy nirmoy added the pending_review_comment label and removed the question label May 27, 2026
@jamieNguyenNVIDIA
Collaborator

Hi @fyu1, can you please rebase this PR?

@fyu1 fyu1 force-pushed the 26.04_linux-nvidia.glue branch from 3889fc6 to 673d4c3 May 28, 2026 22:22
@jamieNguyenNVIDIA
Collaborator

Comment thread drivers/resctrl/mpam_resctrl.c
Comment thread fs/resctrl/rdtgroup.c
@fyu1
Collaborator Author

fyu1 commented May 29, 2026

duplicate for_each_mpam_resctrl_control

Duplicate for_each_mpam_resctrl_control() is fixed.

@fyu1
Collaborator Author

fyu1 commented May 29, 2026

@fyu1

cac60bd NVIDIA: SAUCE: arm_mpam: Include all associated
9515328 NVIDIA: SAUCE: arm_mpam: resctrl: Pre-allocate assignable monitors
02743bd NVIDIA: SAUCE: arm_mpam: resctrl: Pre-allocate free running monitors
43f25ff NVIDIA: SAUCE: untested: arm_mpam: resctrl: pick classes for use as mbm counters

What function are these patches providing? Is it a Grace feature?

These are for assignment mode for mba and mbm. Without these, mba/mbm won't work. They are not Grace-specific.

What are the upstream plans for these patches? (It looks like they were part of the MPAM Part 2 series at one point but were dropped?)

They are being reviewed on LKML: https://lore.kernel.org/lkml/20260520212458.1797221-4-ben.horgan@arm.com/

I verified the 47 patches from upstream were clean picks. No issues with those or with the annotations patch.

cac60bd NVIDIA: SAUCE: arm_mpam: Include all associated

Codex found 2 issues with this patch...

cac60bd drops a source hunk from ac1e5be in drivers/resctrl/mpam_devices.c.

The source patch changes mpam_ris_get_affinity() so memory-class components with empty affinity, or memory classes above level 3, are associated with cpu_possible_mask. The target commit does not carry that over. Current code still just does:

drivers/resctrl/mpam_devices.c:505

  case MPAM_CLASS_MEMORY:
          get_cpumask_from_node_id(comp->comp_id, affinity);
          /* affinity may be empty for CPU-less memory nodes */
          break;

The source has:

  if (cpumask_empty(affinity)) {
          dev_warn_once(..., "CPU-less numa node");
          cpumask_copy(affinity, cpu_possible_mask);
  } else if (class->level > 3)
          cpumask_copy(affinity, cpu_possible_mask);

That matters because cac60bd changes CPU online/offline handling to iterate all components whose affinity contains the CPU. Without the affinity hunk, CPU-less memory nodes stay empty, and level >3 memory components stay tied to their NUMA node mask instead of all CPUs. That undermines the “include all associated” behavior for those components.

The CPU-less feature is not supported on Grace. The code is for the MPAM_CLASS_MEMORY type, which is not supported on Grace, so the code won't be executed on Grace. Without the code, the CPU-less feature is ignored on Vera, which is expected.

The [fenghuay:] note should be improved. It currently does not mention omitting the mpam_ris_get_affinity() hunk. I think this is not just an annotation problem; the hunk should likely be added unless there is a deliberate branch-specific reason to omit it.

I have updated the commit message to note that this hunk is omitted.

cac60bd duplicates for_each_mpam_resctrl_control(). The exact same macro is defined twice in drivers/resctrl/mpam_resctrl.c. That is a cleanup/build-hygiene issue, and the commit note should not say “adds” that macro if it was already present.

Fixed.

@fyu1 fyu1 force-pushed the 26.04_linux-nvidia.glue branch from 673d4c3 to c290cca May 29, 2026 02:40
@nvmochs
Collaborator

nvmochs commented May 29, 2026

@fyu1

Thanks for addressing my previous issues.


d7c3069 - NVIDIA: SAUCE: resctrl/mpam: reset RIS by applying explicit default config

One other question on the reset-RIS patch: the note says reset_cfg is initialized in mpam_reprogram_ris_partid(), but I don’t see it actually being initialized there. mpam_reset_ris() passes an empty:

  struct mpam_config reset_cfg = {};

and mpam_reprogram_ris_partid() relies on missing cfg feature bits to take reset/default paths.

Most controls look okay with that model, but MBW_PBM looks different:

  if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
          if (mpam_has_feature(mpam_feat_mbw_part, cfg))
                  mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
          else
                  mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
  }

With an empty reset_cfg, that appears to write cfg->mbw_pbm == 0 rather than a full reset mask. Is mpam_feat_mbw_part/MBW_PBM also unsupported on the target platforms? If so, can you update the backport note to say that explicitly? Otherwise I think this may need a code adjustment to preserve the source reset behavior.
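
The concern can be reproduced in a small user-space model. This is a sketch under stated assumptions, not the driver code: struct cfg_model, FEAT_MBW_PART_MODEL, and the three function names are hypothetical, and the reset helper is assumed to write an all-ones bitmap, as the phrase "full reset mask" above suggests. The first function keeps the branch order of the snippet quoted in this comment. The second shows the ordering in which a missing feature bit selects the reset value.

```c
#include <stdint.h>

/* Hypothetical model: one bit stands in for mpam_feat_mbw_part. */
#define FEAT_MBW_PART_MODEL (1u << 0)

struct cfg_model {
    uint32_t features;  /* which controls this config carries */
    uint32_t mbw_pbm;   /* bandwidth portion bitmap */
};

/* All-ones mask over the low 'bits' bits, standing in for the reset bitmap. */
static uint32_t full_bitmap(unsigned int bits)
{
    return bits >= 32 ? UINT32_MAX : ((1u << bits) - 1u);
}

/* Branch order as quoted: feature set in cfg -> reset, otherwise write cfg->mbw_pbm. */
static uint32_t mbw_pbm_written_as_quoted(const struct cfg_model *cfg,
                                          unsigned int pbm_bits)
{
    if (cfg->features & FEAT_MBW_PART_MODEL)
        return full_bitmap(pbm_bits);
    return cfg->mbw_pbm;
}

/* Alternative order: feature missing from cfg -> reset, otherwise write cfg->mbw_pbm. */
static uint32_t mbw_pbm_written_reset_on_missing(const struct cfg_model *cfg,
                                                 unsigned int pbm_bits)
{
    if (cfg->features & FEAT_MBW_PART_MODEL)
        return cfg->mbw_pbm;
    return full_bitmap(pbm_bits);
}
```

With a zero-initialized config and an 8-bit bitmap, the quoted order yields 0 while the alternative yields 0xFF, which is the difference the question above is about.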

@fyu1 fyu1 force-pushed the 26.04_linux-nvidia.glue branch from c290cca to 54abfa3 June 3, 2026 08:56
James Morse and others added 23 commits June 5, 2026 15:29
…ABMC use

ABMC, mbm_event mode, has a helper resctrl_arch_config_cntr() for changing
the mapping between 'cntr_id' and a CLOSID/RMID pair.

Add the helper.

For MPAM this is done by updating the mon->mbwu_idx_to_mon[] array, and as
usual CDP means it needs doing in three different ways.

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
(cherry picked from commit 47b7baa https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…trl_arch_reset_cntr()

When used in 'mbm_event' mode, ABMC emulation, resctrl uses arch hooks to
read and reset the memory bandwidth utilization (MBWU) counters.

Add these.

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
(cherry picked from commit ed60492 https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…tation

Memory bandwidth monitoring makes use of MBWU monitors and is now exposed
to the user via resctrl. Add some documentation so the user knows what to
expect.

Co-developed-by: James Morse <james.morse@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
(cherry picked from commit 3219a44 https://gitlab.arm.com/linux-arm/linux-bh.git mpam_abmc_v4)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
Add the required hook to pre-round a userspace memory bandwidth
allocation percentage value to a value acceptable to the driver backend.
For MPAM, no rounding is needed because the driver has all the
information necessary for rounding the value when
resctrl_arch_update_one() is called.
So, just "round" the value to itself here.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 935611d https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `include/linux/arm_mpam.h`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…arch

The control value parser for the MB resource currently coerces the
memory bandwidth percentage value from userspace to be an exact
multiple of the bw_gran parameter.
On MPAM systems, this results in somewhat worse-than-worst-case
rounding, since bw_gran is in general only an approximation to the
actual hardware granularity, and the hardware bandwidth allocation
control value is not natively a percentage.
Allow the arch to provide its own conversion that is appropriate for
the hardware, and move the existing conversion to x86.  This will avoid
accumulated error from rounding the value twice on MPAM systems.
Clarify the documentation, but avoid overly exact promises.
Clamping to bw_min and bw_max still feels generic: leave it in the core
code, for now.
No functional change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit cabdc68 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
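
The double-rounding effect described in the commit message above can be illustrated with a small model. This is only a sketch under stated assumptions: the hardware control is taken to have `wd` fractional bits (so 2^wd steps), `bw_gran` is taken to be an integer approximation of the step size, and the generic coercion is modeled as rounding up to a multiple of `bw_gran`. Both function names are hypothetical.

```c
#include <assert.h>

/* Round v up to the next multiple of gran (models the generic coercion). */
static unsigned int round_up_to_gran(unsigned int v, unsigned int gran)
{
	assert(gran > 0);
	return ((v + gran - 1) / gran) * gran;
}

/* Convert a percentage to the nearest of 2^wd hardware steps (assumed model). */
static unsigned int percent_to_steps(unsigned int pc, unsigned int wd)
{
	assert(pc <= 100 && wd <= 16);
	return ((pc << wd) + 50) / 100;
}
```

Assuming `wd` is 3 (steps of 12.5%) and `bw_gran` is reported as 13, a request of 40 converts directly to 3 steps (37.5%). Pre-rounding it to 52 first lands on 4 steps (50%), so the second rounding compounds the error of the first.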
…isable monitor overflow

Resctrl has an overflow handler that runs on each domain every second
to ensure that any overflow of the hardware counter is accounted for.
MPAM can have counters as large as 63 bits, in which case there is no
need to check for overflow.
To allow other architectures to disable this, add a helper that reports
whether counters can overflow.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 6a4360b https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…n overflow

Resctrl has an overflow handler that runs on each domain every second
to ensure that any overflow of the hardware counter is accounted for.
MPAM can have counters as large as 63 bits, in which case there is no
need to check for overflow.
To allow the overflow handler to be disabled, determine if an overflow
can happen. If a class is not implemented, or has the 63-bit counter,
it can't overflow.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 0f6aefd https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `drivers/resctrl/mpam_resctrl.c`;
  - Remove overflow check on QOS_L3_MBM_LOCAL_EVENT_ID since it's not
    supported in MPAM anymore.
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
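
The rule stated in the commit message above (a class that is not implemented, or that has the 63-bit counter, cannot overflow) can be sketched as a predicate. `struct class_model` and `class_can_overflow()` are hypothetical names for illustration, and the narrower counter widths used in the example are assumptions, not taken from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one monitored class (assumption). */
struct class_model {
	bool implemented;		/* class has bandwidth monitors at all */
	unsigned int counter_bits;	/* width of the hardware counter */
};

/* True only if the class exists and its counter is narrower than 63 bits. */
static bool class_can_overflow(const struct class_model *c)
{
	assert(c != NULL);
	if (!c->implemented)
		return false;
	return c->counter_bits < 63;
}
```

A caller would only need to schedule the once-per-second overflow handler when this predicate is true for at least one monitored class.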
Resctrl has an overflow handler that runs on each domain every second
to ensure that any overflow of the hardware counter is accounted for.
MPAM can have counters as large as 63 bits, in which case there is no
need to check for overflow.
Call the new arch helpers to determine this.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 72e375a https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
mpam_reprogram_ris_partid() always resets the CMAX/CMIN controls to their
'unrestricted' value.
This prevents the controls from being configured.
Add fields in struct mpam_config, and program these values when they
are set in the features bitmask.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit e701b28 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `drivers/resctrl/mpam_devices.c`;
  - Resolve minor conflicts in `drivers/resctrl/mpam_internal.h`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…re-use

Functions like mbw_max_to_percent() convert a value into MPAM's 16-bit
fixed point fraction format. These are not only used for memory
bandwidth, but cache capacity controls too.
Rename these functions to convert to/from a 'fract16', and add
helpers for the specific mbw_max/cmax controls.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 738f160 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `drivers/resctrl/mpam_resctrl.c`;
  - Resolve minor conflicts in `drivers/resctrl/test_mpam_resctrl.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
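
As a rough illustration of converting between a percentage and a 16-bit fixed point fraction, here is a minimal sketch. It is not the driver's fract16_to_percent() or percent_to_fract16(), and it may not match the exact encoding in the MPAM specification. It assumes the fraction is left-aligned in 16 bits with only the top `wd` bits implemented, and the `_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage (0..100) to a left-aligned 16-bit fraction with wd valid bits. */
static uint16_t percent_to_fract16_sketch(unsigned int pc, unsigned int wd)
{
	uint32_t steps;

	assert(pc <= 100 && wd >= 1 && wd <= 16);
	steps = ((pc << wd) + 50) / 100;	/* nearest step out of 2^wd */
	if (steps > (1u << wd) - 1)
		steps = (1u << wd) - 1;		/* largest encodable fraction */
	return (uint16_t)(steps << (16 - wd));
}

/* Left-aligned 16-bit fraction with wd valid bits back to a percentage. */
static unsigned int fract16_to_percent_sketch(uint16_t fract, unsigned int wd)
{
	uint32_t steps;

	assert(wd >= 1 && wd <= 16);
	steps = (uint32_t)fract >> (16 - wd);	/* drop unimplemented low bits */
	return (steps * 100 + (1u << (wd - 1))) / (1u << wd);
}
```

Passing the width explicitly keeps the rounding tied to the bits the hardware actually implements, which is the reason both helpers take `wd` in this sketch.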
… separate struct

struct resctrl_membw combines parameters that are related to the control
value, and parameters that are specific to the MBA resource.
To allow the control value parsing and management code to be re-used for
other resources, it needs to be separated from the MBA resource.
Add struct resctrl_mba that holds all the parameters that are specific
to the MBA resource.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit c113346 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
parse_cbm() and parse_bw() both test the staged config for an existing
entry. These would indicate user-space has provided a schema with a
duplicate domain entry. e.g:
| L3:0=ffff;1=f00f;0=f00f
If new parsers are added this duplicate domain test has to be duplicated.
Move it to the caller.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 827c80b https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `fs/resctrl/ctrlmondata.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
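
The idea of testing for a duplicate domain once in the caller, rather than in every parser, can be sketched in user-space C. This is a simplified illustration, not the resctrl code: `schema_line_has_duplicate()` and `MAX_DOMAIN_ID` are hypothetical, and the input is assumed to be the part of the schema line after the resource name, for example `0=ffff;1=f00f;0=f00f`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DOMAIN_ID 63	/* arbitrary bound for this sketch */

/*
 * Returns 1 if a domain id is repeated, 0 if all ids are unique,
 * and -1 if the line is malformed.
 */
static int schema_line_has_duplicate(const char *line)
{
	bool seen[MAX_DOMAIN_ID + 1] = { false };
	const char *p = line;

	assert(line != NULL);
	for (;;) {
		char *end;
		unsigned long id = strtoul(p, &end, 10);

		if (end == p || *end != '=' || id > MAX_DOMAIN_ID)
			return -1;
		if (seen[id])
			return 1;	/* checked once here, not in each parser */
		seen[id] = true;

		p = strchr(end, ';');	/* skip the value, find the next entry */
		if (!p)
			return 0;
		p++;
	}
}
```

Because the check never looks at the value after `=`, it works unchanged for bitmap, percentage, or any later control format.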
…nstead of parse_bw()

MBA is only supported on platforms where the delay inserted by the control
is linear. Resctrl checks the two properties provided by the arch code
match each time it parses part of a new control value.
This doesn't need to be done so frequently, and obscures changes to
parse_bw() to abstract it for use with other control types.
Move this check to the parse_line() caller so it only happens once.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 85be43b https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…de resource

resctrl_get_default_ctrl() is called by both the architecture code and
filesystem code to return the default value for a control. This depends
on the schema format.
parse_bw() doesn't bother checking the bounds it is given if the
resource is in use by mba_sc. This is because the values parsed from
user-space are not the same as those the control should take.
To make this disparity easier to work with, a second different copy
of the schema format is needed, which would need a version of
resctrl_get_default_ctrl(). This would let the resctrl change the
schema format presented to user-space, provided it converts it to match
what the architecture code expects.
Rename resctrl_get_default_ctrl() to make it clear it returns the
resource default.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit a4ba73c https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `drivers/resctrl/mpam_resctrl.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…g it to be different

parse_bw() doesn't bother checking the bounds it is given if the
resource is in use by mba_sc. This is because the values parsed from
user-space are not the same as those the control should take.
To make this disparity easier to work with, a second different copy
of the schema format is needed, which would need a version of
resctrl_get_default_ctrl(). This would let the resctrl change the
schema format presented to user-space, provided it converts it to match
what the architecture code expects.
Add a second schema format for use with mba_sc. The membw properties
are copied and the schema version is used. When mba_sc is enabled
the schema copy of these properties is modified.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 225d28e https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `fs/resctrl/ctrlmondata.c`;
  - Resolve minor conflicts in `include/linux/arm_mpam.h`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
Resctrl allows the architecture code to specify the schema format for a
control. Controls can either take a bitmap, or some kind of number.
If user-space doesn't know what a control is by its name, it could be
told the schema format. 'Some kind of number' isn't useful as the
difference between a percentage and a value in MB/s affects how these
would be programmed, even if resctrl's parsing code doesn't need to
care.
Add the types resctrl already has in addition to 'range'. This
allows architectures to move over before 'range' is removed. These
new schema formats are parsed the same, but will additionally affect
which files are visible.
Schema formats with a double underscore should not be considered
portable between architectures, and are likely to be described to
user-space as 'platform defined'. AMDs MBA resource is configured
with an absolute bandwidth measured in multiples of one eighth of
a GB per second. resctrl needs to be aware of this platform
defined format to ensure the existing 'MB' files continue to be
shown.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit bb81e48 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
Resctrl specifies the schema format for MB and SMBA in rdt_resources_all[].
Intel platforms take a percentage for MB, AMD platforms take an absolute
value which isn't MB/s. Currently these are both treated as a 'range'.
Adding support for additional types of control shows that user-space
needs to be told what the control formats are. Today users of resctrl
must already know if their platform is Intel or AMD to know how the
MB resource will behave.
The MPAM support exposes new control types that take a 'percentage'.
The Intel MB resource is also configured by a percentage, so should be
able to expose this to user-space.
Remove the static configuration for schema_fmt in rdt_resources_all[]
and specify it with the other control properties in
__get_mem_config_intel() or __get_mem_config_amd().

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 3323499 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `arch/x86/kernel/cpu/resctrl/core.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…tmap controls

MPAM has cache capacity controls that effectively take a percentage.
Resctrl supports percentages, but the collection of files that are
exposed to describe this control belong to the MB resource.
To find the minimum granularity of the percentage cache capacity controls,
user-space is expected to read the bandwidth_gran file, and know this has
nothing to do with bandwidth.
The only problem here is the name of the file. Add duplicates of these
properties with percentage and bitmap in the name. These will be exposed
based on the schema format.
The existing files must remain tied to the specific resources so that
they remain visible to user-space. Using the same helpers ensures the
values will always be the same regardless of the file used.
These files are not exposed until the new RFTYPE schema flags are
set on a resource 'fflags'.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit a38c116 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `fs/resctrl/internal.h`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…n schema format

MPAM has cache capacity controls that effectively take a percentage.
Resctrl supports percentages, but the collection of files that are
exposed to describe this control belong to the MB resource. New files
have been added that are selected based on the schema format.
Apply the flags to enable these files based on the schema format.
Add a new fflags_from_schema() that is used for controls.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit db00568 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `fs/resctrl/rdtgroup.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
If more schemas are added to resctrl, user-space needs to know how to
configure them. To allow user-space to configure schemas it doesn't know
about, it would be helpful to tell user-space the format, e.g. percentage.
Add a file under info that describes the schema format.
Percentages and 'mbps' are implicitly decimal, bitmaps are expected to be
in hex.

Signed-off-by: James Morse <james.morse@arm.com>
(forward ported from commit f0ae691 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `fs/resctrl/rdtgroup.c`;
  - Add RESCTRL_SCHEMA_RANGE in resctrl_schema_format_show();
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
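
The point that percentages and 'mbps' are decimal while bitmaps are hex can be shown with a small user-space parser. This is a sketch only: `enum schema_fmt_model` and `parse_ctrl_value()` are hypothetical, and the kernel's enum names and parsing code may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical format tags; the kernel's enum names may differ. */
enum schema_fmt_model { FMT_BITMAP, FMT_PERCENT, FMT_MBPS };

/*
 * Parse one control value: bitmaps in hex, percentages and MB/s in decimal.
 * Returns 0 on success and -1 if the string is not a valid number.
 */
static int parse_ctrl_value(enum schema_fmt_model fmt, const char *s,
			    unsigned long *out)
{
	int base = (fmt == FMT_BITMAP) ? 16 : 10;
	char *end;

	assert(s != NULL && out != NULL);
	errno = 0;
	*out = strtoul(s, &end, base);
	if (errno || end == s || *end != '\0')
		return -1;
	return 0;
}
```

A tool that reads the format from the info file could pick the base this way without knowing the schema by name.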
MPAM can have both cache portion and cache capacity controls on any cache
that supports MPAM. Cache portion bitmaps can be exposed via resctrl if
they are implemented on L2 or L3.
The cache capacity controls cannot be used to isolate portions, which is
implicit in the L2 or L3 bitmap provided by user-space. These controls
need to be configured with something more like a percentage.
Add the resource enum entries for these two resources. No additional
resctrl code is needed because the architecture code will specify this
resource takes a 'percentage', re-using the support previously used only
for the MB resource.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 2e9f961 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `include/linux/resctrl.h`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…m cmax

MPAM's maximum cache-capacity controls take a fixed point fraction format.
Instead of dumping this on user-space, convert it to a percentage.
User-space using resctrl already knows how to handle percentages.

Signed-off-by: James Morse <james.morse@arm.com>
(cherry picked from commit 10caa12 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
  - Resolve minor conflicts in `drivers/resctrl/mpam_resctrl.c`;
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
…onfig

Reset an RIS by building a default mpam_config and applying it via
mpam_reprogram_ris_partid(), like any other config.

- mpam_init_reset_cfg(): set features and default values only for
  controls supported by the RIS (cpor_part, mbw_part, mbw_max,
  mbw_prop, cmax_cmax, cmax_cmin). Use full masks for CPBM/MBW_PBM
  and MPAMCFG_* defaults for MBW_MAX, CMAX, CMIN.
- mpam_reprogram_ris_partid(): apply cfg for all supported controls
  (no separate reset path).

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(forward ported from commit e0b6de0 https://github.com/NVIDIA/NV-Kernels 24.04_linux-nvidia-6.17-next)
[fenghuay:
Since upstream code changes, a few changes in previous commit e0b6de0
are irrevelant or are merged in upstream already. Still keep the commit
messge to keep the history but add the following change log:
  - reset_cpbm and reset_mbw_pbm are not used. no need to define them;
  - Resolve minor conflicts in `drivers/resctrl/mpam_devices.c`;
  - mpam_init_reset_cfg() has been removed from upstream.
    Empty reset_cfg is used in mpam_reprogram_ris_partid(&reset_cfg)
    to reset feature values;
  - Remove changes of fract16_to_percent() and percent_to_fract16() since
    they don't take wd and don't match definition of fixed-point fractional
    format defined in MPAM spec.
  - MBW_PBM is set incorrectly. Revert its setting with/without cfg in
    mpam_reprogram_ris_partid().
]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
@fyu1
Collaborator Author

fyu1 commented Jun 5, 2026

@fyu1

Same comment as Jamie about the missing "NVIDIA: SAUCE: arm_mpam: Include all associated" patch.

712640a NVIDIA: SAUCE: resctrl/mpam: reset RIS by applying explicit default config

This patch is documented as a forward-port of e0b6de0, but it only carries a tiny subset of the source behavior.

In source e0b6de0, the patch is large:

  • removes mpam_reset_msc_bitmap()
  • removes reset_cpbm / reset_mbw_pbm from struct mpam_config
  • changes mpam_reprogram_ris_partid() so reset is driven by an explicit default mpam_config
  • adds/uses mpam_init_reset_cfg(&reset_cfg, &ris->props)
  • changes mpam_reset_component_cfg() to initialize explicit defaults and feature bits
  • changes fract16_to_percent() / percent_to_fract16()

But picked commit 712640a only adds:

  if (cprops->cmax_wd)
          comp->cfg[i].cmax = MPAMCFG_CMAX_CMAX;

in drivers/resctrl/mpam_devices.c:2632.

That means the annotation is misleading in several places:

  • It says reset_cpbm / reset_mbw_pbm do not need to be defined, but that is not really the meaningful forward-port explanation. The picked tree still uses the older reset helper path via drivers/resctrl/mpam_devices.c:1441.
  • It says mpam_init_reset_cfg() was removed upstream and reset_cfg is initialized in mpam_reprogram_ris_partid(&reset_cfg), but the current code still passes an empty reset_cfg from drivers/resctrl/mpam_devices.c:1721.
  • It says fract16_to_percent() and percent_to_fract16() were removed, but both still exist in drivers/resctrl/mpam_resctrl.c:764.
    I changed the change log to say "changes of fract16_to_percent() and percent_to_fract16() were removed.." to make it clear that only changes in the previous commit were removed, not the entire functions.

So my concern is not just “patch-id mismatch”; it is that the annotation claims several source hunks were handled, while the actual picked commit only initializes cmax in component config. Either the source behavior was already present/dropped elsewhere and the annotation should say that clearly, or this forward-port missed most of the source patch.
As discussed in the meeting, I will keep the commit message and list the changes in the change log to make tracking easier.

@fyu1
Collaborator Author

fyu1 commented Jun 5, 2026

@fyu1
Thanks for addressing my previous issues.
d7c3069 - NVIDIA: SAUCE: resctrl/mpam: reset RIS by applying explicit default config
One other question on the reset-RIS patch: the note says reset_cfg is initialized in mpam_reprogram_ris_partid(), but I don’t see it actually being initialized there. mpam_reset_ris() passes an empty:

  struct mpam_config reset_cfg = {};

and mpam_reprogram_ris_partid() relies on missing cfg feature bits to take reset/default paths.
Most controls look okay with that model, but MBW_PBM looks different:

  if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
          if (mpam_has_feature(mpam_feat_mbw_part, cfg))
                  mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
          else
                  mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
  }

With an empty reset_cfg, that appears to write cfg->mbw_pbm == 0 rather than a full reset mask. Is mpam_feat_mbw_part/MBW_PBM also unsupported on the target platforms? If so, can you update the backport note to say that explicitly? Otherwise I think this may need a code adjustment to preserve the source reset behavior.

@fyu1 In the latest update (Jun 3 - range: a71d5d9^..712640a) this does not appear to be addressed.

The current replacement appears to be 712640a rather than d7c3069, but the relevant code is still the same:

  • drivers/resctrl/mpam_devices.c:1721 still creates struct mpam_config reset_cfg = {};
  • drivers/resctrl/mpam_devices.c:1571 still has the suspicious MBW_PBM logic:
  if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
  	if (mpam_has_feature(mpam_feat_mbw_part, cfg))
  		mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
  	else
  		mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
  }

With the empty reset_cfg, this still takes the else path and writes cfg->mbw_pbm == 0.

The annotation was changed since d7c3069, but not in a way that resolves the above finding. Current 712640a still says reset_cfg is initialized in mpam_reprogram_ris_partid(&reset_cfg), and it still does not say mpam_feat_mbw_part / MBW_PBM is unsupported on the target platforms.

The current note either needs to explicitly justify why MBW_PBM cannot matter here, or the code needs to preserve the source reset behavior for MBW_PBM.

@fyu1 fyu1 closed this Jun 5, 2026
@fyu1 fyu1 reopened this Jun 5, 2026
@fyu1 fyu1 force-pushed the 26.04_linux-nvidia.glue branch from 712640a to 91f12a3 Compare June 5, 2026 15:39
@fyu1
Collaborator Author

fyu1 commented Jun 5, 2026

The updated PR has the following changes:

  1. Fix MBW_PBM setting in mpam_reprogram_ris_partid() (Matt)
  2. Remove QOS_L3_MBM_LOCAL_EVENT_ID checking in resctrl_arch_mon_can_overflow() (Jamie)
  3. Update the change log in 91f12a3 to clarify the many differences from the original 6.17 hwe commit (Matt and Jamie)

@nvmochs
Collaborator

nvmochs commented Jun 5, 2026

@fyu1

I re-reviewed the latest and it looks much better, thanks!

One minor nit: the new 91f12a3 note has typos: irrevelant and messge.

Other than that, I will wait for you to replace the inline MBW_PBM fix with a standalone fix commit as discussed during the scrub meeting this AM. Once that is present I think we can get this merged.


Labels

help wanted Extra attention is needed pending_review_comment


10 participants