[linux-nvidia-6.17-next] Add CXL Type-2 device support, RAS error handling, reset, state save/restore, and interleaving support #342
@JiandiAnNVIDIA Finished going through this PR and have some questions / comments...

862702c NVIDIA: VR: SAUCE: [Config] CXL config annotations for Type-2 device and RAS support
These are already set in master annotations, why do we need to add them to nvidia annotations?
They must be built-in now?
Should this be removed from master if no longer in the code?

I confirmed the LKML backports match their origin and appear to have proper tags.

1a7d28e PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h
Why is this pulling in all the PCI_IDE stuff? That is not part of the original commit. This may pick easier if you also pick "f16469ee733a PCI/IDE: Enumerate Selective Stream IDE capabilities" before it. Or, when fixing up the collision, don't include the PCI_IDE content.

3639a51 cxl/test: Add cxl_test CFMWS support for extended linear cache
Since there was an adjustment with these, need to change "cherry picked from" to "backported from".

8c829ab cxl/test: Add support for acpi extended linear cache
Did these pick clean? There were some context differences in the diff, but git may have been able to handle the merge okay. Just want to double check.
Do we need the last 3 annotations? They were not used in the last PR.
Nice find. Not including the PCI_IDE content would seem like the cleaner approach here (in my opinion, at least). Although if you wanted to go with the former, I think you may want this one "254599fc8301 PCI: Add PCIe Device 3 Extended Capability enumeration" in addition to the commit that Matt called out. Other than that, all I have is this nit that Cursor found:
The HWQA and teams with NVBugs doing testing / debugging for type3 and type2 CXL devices need these as y.

Why m is NOT okay for CXL_BUS, CXL_PCI, CXL_MEM, CXL_PORT

These are tristate with default CXL_BUS. If CXL_BUS=m, everything defaults to m. With m, the CXL subsystem loads as modules. That's fine for normal memory expansion use cases but not for Type-2 device support, where the CXL infrastructure must be present before driver probe of the accelerator device.
I initially removed this and added CONFIG_CXL_RAS in debian.master, thinking this is what the Ubuntu kernel maintainers would do when they move to kernel v7.0 or above, where Terry Bowman's patch that did this replacement is merged. But for now Terry's patches (although already merged in v7.0) are only applied to the linux-nvidia-6.17-next kernel, so I thought it might be better to leave debian.master unchanged and override in debian.nvidia-6.17. I can remove it from debian.master and not add CONFIG_CXL_RAS to debian.master. Just add CONFIG_CXL_RAS to debian.nvidia-6.17.
Good catch. Will fix. This commit gave multiple conflicts because one of Nicolin's earlier commits, "df59703f696a iommu/arm-smmu-v3: Allow ATS to be always on", added some definitions while this commit redefines them and moves them to a different place in the file. The PCI_IDE_* stuff was the anchor of the upstream commit. Since the CXL DVSEC area also had a real conflict (from the NVIDIA SAUCE CXL_DVSEC_CACHE_CAPABLE commit 72bd823), the whole file had multiple conflict regions. The likely resolution was git checkout --theirs -- include/uapi/linux/pci_regs.h, which accepts the entire upstream file at the 0f7afd8 commit tree, including all the PCI_IDE_* content from f16469e that was never in the cherry-pick list.
Will fix
These picks did not hit conflicts. The git cherry-pick 3-way merge was able to handle the merge. The 6.17 HWE branch already has Koba's commit and Vishal's zero size decoder fix commit. For example, Vishal's zero size decoder fix commit added something that causes line shifts when picking d568723 cxl: Add a cached copy of target_map to cxl_decoder. And my conflict-resolution changes on some commits cause subsequent commits to shift during 3-way merge.
Will fix. |
Previous PR was picking anything touching drivers/cxl between 6.17.9 and 6.18-rc5, where Terry Bowman's v13 and Alejandro Lucero's v22 were based. So this PR pulled in more commits, which include the following:
c460697 lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
4d873c5 arm64: Select GENERIC_CPU_CACHE_MAINTENANCE
2ec3b54 cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent
Understood. Thanks for explaining.
Let's try this approach and see what Canonical says during their review.
Thanks for clarifying. |
e836e83 to
0c0118d
Compare
nirmoy
left a comment
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Patches look fine to me. Make sure to do enough regression testing on Grace and Spark as discussed.
0c0118d to
27ff641
Compare
	if (cxld->interleave_ways != iw ||
	    (iw > 1 && cxld->interleave_granularity != ig) ||
	    !spa_maps_hpa(p, &cxld->hpa_range) ||
	    !region_res_match_cxl_range(p, &cxld->hpa_range) ||
The latest revision introduces a compilation issue here:
drivers/cxl/core/region.c: In function ‘cxl_port_setup_targets’:
drivers/cxl/core/region.c:1716:22: error: implicit declaration of function ‘region_res_match_cxl_range’ [-Werror=implicit-function-declaration]
1716 | !region_res_match_cxl_range(p, &cxld->hpa_range) ||
This function was renamed by 24366091ed5b
Thanks. Will fix this. I'm working on adding the save and restore and cxl reset series next. I was planning to push as I go, to get the interleaving, save and restore, and cxl reset series all applied and then work through the issues. I wanted to push first because I previously had an accident where the entire directory with all my patches applied and conflicts resolved was deleted and I had to start over.
27ff641 to
9e976e2
Compare
|
Latest push looks good to me.
|
nirmoy
left a comment
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
9e976e2 to
6c09835
Compare
|
I've backported TLB-related patches onto Jiandi's branch. If anyone is available, please review them for me. Thanks!
|
@JiandiAnNVIDIA A few additional comments / questions on your latest updates to the PR: I confirmed that there were only 3 commits from the upstream dependency patches that differed from what I previously reviewed:
I also confirmed that the only change in the existing CONFIG annotations patches was the removal of CONFIG_PCIEAER_CXL from debian.master that I had previously requested (thanks for addressing this as well).

I verified all the NVIDIA:SAUCE patches that were present in my original review remain intact, unmodified.

For the new NVIDIA:SAUCE patches that added the CXL save/restore and cxl-reset support, except for a few commits that I call out below, I verified that the patches either picked clean or (in cases where modifications were made) that the backport notes were accurate.

Followup question on 15609f2 PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h:
In my original review, there was a patch just after this patch, 197d61c PCI: Update CXL DVSEC definitions, that is no longer present in the branch. It appears that the content from that patch was squashed into "15609f2ae03c PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h". Can you confirm this? Can we still include that patch?

0ce1557 NVIDIA: VR: SAUCE: PCI: Add HDM decoder state save/restore
Nit: Can you expand a bit on the backport note to include "why" cxl.h is needed? i.e. because of conflict resolution in "35460d55daed NVIDIA: VR: SAUCE: cxl: Move HDM decoder and register map definitions to include/cxl/cxl.h"

689a3a3 NVIDIA: VR: SAUCE: PCI: Add CXL DVSEC reset and capability register definitions
Fixed. Included the original patch for a commit-to-commit match with Terry Bowman's merged series.
Fixed
Fixed. |
6c09835 to
1de21c1
Compare
|
This patch "PCI: Update CXL DVSEC definitions" missed one rename |
Thanks for including this patch again. The backports for these two patches look much better now. Only need to address the remaining renames that Nirmoy pointed out.
Thanks for addressing these...I confirmed the updated backport notes look good. |
1de21c1 to
acf188b
Compare
Fixed. |
… to include/cxl/cxl.h
Move CXL HDM decoder register defines, register map structs
(cxl_reg_map, cxl_component_reg_map, cxl_device_reg_map, cxl_pmu_reg_map,
cxl_register_map), cxl_hdm_decoder_count(), enum cxl_regloc_type, and
cxl_find_regblock()/cxl_setup_regs() declarations from internal CXL
headers to include/cxl/pci.h.
This makes them accessible to code outside the CXL subsystem, in
particular the PCI core CXL state save/restore support added in a
subsequent patch.
No functional change.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/)
[jan: Resolve conflicts by moving certain definitions to include/cxl/cxl.h
instead of to include/cxl/pci.h to align with its dependency of
Alejandro's series]
Signed-off-by: Jiandi An <jan@nvidia.com>
…state
Add pci_add_virtual_ext_cap_save_buffer() to allocate save buffers
using virtual cap IDs (above PCI_EXT_CAP_ID_MAX) that don't require
a real capability in config space.
The existing pci_add_ext_cap_save_buffer() cannot be used for
CXL DVSEC state because it calls pci_find_saved_ext_cap()
which searches for a matching capability in PCI config space.
The CXL state saved here is a synthetic snapshot (DVSEC+HDM)
and should not be tied to a real extended-cap instance. A
virtual extended-cap save buffer API (cap IDs above
PCI_EXT_CAP_ID_MAX) allows PCI to track this state without
a backing config space capability.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Save and restore CXL DVSEC control registers (CTRL, CTRL2), range
base registers, and lock state across PCI resets.
When the DVSEC CONFIG_LOCK bit is set, certain DVSEC fields
become read-only and hardware may have updated them.
Blindly restoring saved values would be silently ignored or
conflict with hardware state. Instead, a read-merge-write
approach is used: current hardware values are read for the
RWL (read-write-when-locked) fields and merged with saved
state, so only writable bits are restored while locked bits
retain their hardware values.
Hooked into pci_save_state()/pci_restore_state() so all PCI reset
paths automatically preserve CXL DVSEC configuration.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/)
[jan: Resolve minor conflict in drivers/pci/Makefile due to code line shifts]
Signed-off-by: Jiandi An <jan@nvidia.com>
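The read-merge-write step described in this commit message can be illustrated with a small userspace sketch. This is not the code from the patch: the function name `ex_dvsec_merge_restore`, the mask argument, and the `locked` flag are hypothetical, and which bits are RWL in a given DVSEC register is left to the caller.

```c
#include <stdint.h>

/*
 * Illustrative only: merge a saved 16-bit register snapshot with the value
 * currently held by hardware. Bits set in rwl_mask are assumed to be
 * read-write-when-locked (RWL) fields. When the lock is set those bits keep
 * the current hardware value; every other bit comes from the saved snapshot.
 */
static uint16_t ex_dvsec_merge_restore(uint16_t saved, uint16_t current,
                                       uint16_t rwl_mask, int locked)
{
    if (!locked)
        return saved; /* nothing is frozen, restore the snapshot as-is */

    return (uint16_t)((current & rwl_mask) | (saved & (uint16_t)~rwl_mask));
}
```

With a lock in effect, `ex_dvsec_merge_restore(0x1234, 0xABCD, 0x00FF, 1)` keeps the low byte from hardware and the high byte from the snapshot.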
Save and restore CXL HDM decoder registers (global control,
per-decoder base/size/target-list, and commit state) across PCI
resets. On restore, decoders that were committed are reprogrammed
and recommitted with a 10ms timeout. Locked decoders that are
already committed are skipped, since their state is protected by
hardware and reprogramming them would fail.
The Register Locator DVSEC is parsed directly via PCI config space
reads rather than calling cxl_find_regblock()/cxl_setup_regs(),
since this code lives in the PCI core and must not depend on CXL
module symbols.
MSE is temporarily enabled during save/restore to allow MMIO
access to the HDM decoder register block.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/)
[jan: Include <cxl/cxl.h> in drivers/pci/cxl.c due to conflict resolution
in "4acbc27592b8 NVIDIA: VR: SAUCE: cxl: Move HDM decoder and register
map definitions to include/cxl/cxl.h"]
Signed-off-by: Jiandi An <jan@nvidia.com>
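A rough userspace model of the recommit decision and bounded poll might look like the following. Everything here is an assumption made for illustration: the `EX_HDM_*` names, the callback-based register read, and the use of a poll count in place of the 10ms timeout are not taken from the patch, and the bit positions are only meant to resemble the HDM decoder control register layout.

```c
#include <stdint.h>

/* Assumed control-register bits, for illustration only. */
#define EX_HDM_LOCK       (1u << 8)
#define EX_HDM_COMMIT     (1u << 9)
#define EX_HDM_COMMITTED  (1u << 10)
#define EX_HDM_COMMIT_ERR (1u << 11)

enum ex_commit_result { EX_SKIPPED, EX_OK, EX_ERROR, EX_TIMEOUT };

typedef uint32_t (*ex_read_ctrl_fn)(void *ctx);

/*
 * Decide whether a decoder needs recommitting, then poll for completion.
 * max_polls stands in for the timeout; real code would delay between reads.
 */
static enum ex_commit_result ex_hdm_recommit(uint32_t saved_ctrl,
                                             uint32_t hw_ctrl,
                                             ex_read_ctrl_fn read_ctrl,
                                             void *ctx, unsigned max_polls)
{
    if (!(saved_ctrl & EX_HDM_COMMITTED))
        return EX_SKIPPED; /* was not committed before the reset */
    if ((hw_ctrl & EX_HDM_LOCK) && (hw_ctrl & EX_HDM_COMMITTED))
        return EX_SKIPPED; /* locked and still committed: leave it alone */

    /* Real code would reprogram base/size/targets and set COMMIT here. */
    for (unsigned i = 0; i < max_polls; i++) {
        uint32_t v = read_ctrl(ctx);

        if (v & EX_HDM_COMMIT_ERR)
            return EX_ERROR;
        if (v & EX_HDM_COMMITTED)
            return EX_OK;
    }
    return EX_TIMEOUT;
}

/* Fake decoder that reports COMMITTED after a fixed number of reads. */
struct ex_fake_decoder { unsigned polls_until_committed; unsigned polls; };

static uint32_t ex_fake_read_ctrl(void *ctx)
{
    struct ex_fake_decoder *d = ctx;

    return (++d->polls >= d->polls_until_committed) ? EX_HDM_COMMITTED : 0;
}
```

The fake decoder exists only so the poll loop can be exercised without hardware.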
…efinitions
Add CXL DVSEC register definitions needed for CXL device reset per
CXL r3.2 section 8.1.3.1:
- Capability bits: RST_CAPABLE, CACHE_CAPABLE, CACHE_WBI_CAPABLE,
  RST_TIMEOUT, RST_MEM_CLR_CAPABLE
- Control2 register: DISABLE_CACHING, INIT_CACHE_WBI, INIT_CXL_RST,
  RST_MEM_CLR_EN
- Status2 register: CACHE_INV, RST_DONE, RST_ERR
- Non-CXL Function Map DVSEC register offset
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
[jan: Resolve conflicts where PCI_DVSEC_CXL_CACHE_CAPABLE is already added
by "72bd823fb4f1 NVIDIA: VR: SAUCE: PCI: Allow ATS to be always on for
CXL.cache capable devices"]
Signed-off-by: Jiandi An <jan@nvidia.com>
…_restore()
Export pci_dev_save_and_disable() and pci_dev_restore() so that
subsystems performing non-standard reset sequences (e.g. CXL)
can reuse the PCI core standard pre/post reset lifecycle:
driver reset_prepare/reset_done callbacks, PCI config space
save/restore, and device disable/re-enable.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
Add infrastructure for quiescing the CXL data path before reset:
- Memory offlining: check if CXL-backed memory is online and offline
  it via offline_and_remove_memory() before reset, per CXL spec
  requirement to quiesce all CXL.mem transactions before issuing
  CXL Reset.
- CPU cache flush: invalidate cache lines before reset
  as a safety measure after memory offline.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…XL reset
Add sibling PCI function save/disable/restore coordination for CXL
reset. Before reset, all CXL.cachemem sibling functions are locked,
saved, and disabled; after reset they are restored. The Non-CXL
Function Map DVSEC and per-function DVSEC capability register are
consulted to skip non-CXL and CXL.io-only functions. A global mutex
serializes concurrent resets to prevent deadlocks between sibling
functions.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
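The sibling filter described above (skip non-CXL functions and CXL.io-only functions) could be modeled roughly as below. This is a sketch under assumptions: the `EX_*` bit positions, the single 32-bit function map, and the function name are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DVSEC capability bits, for illustration only. */
#define EX_CAP_CACHE (1u << 0)
#define EX_CAP_IO    (1u << 1)
#define EX_CAP_MEM   (1u << 2)

/*
 * Return true when a sibling function should take part in the
 * save/disable/restore sequence. non_cxl_map is a hypothetical bitmap in
 * which bit N set means function N is listed as non-CXL.
 */
static bool ex_sibling_needs_save(uint32_t non_cxl_map, unsigned fn,
                                  uint16_t dvsec_cap)
{
    if (fn < 32 && (non_cxl_map & (1u << fn)))
        return false; /* listed in the Non-CXL Function Map */

    /* CXL.io-only functions carry neither the cache nor the mem bit. */
    return (dvsec_cap & (EX_CAP_CACHE | EX_CAP_MEM)) != 0;
}
```

A caller would iterate over the functions of the device and apply this predicate before locking and saving each one.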
…ration
cxl_dev_reset() implements the hardware reset sequence:
optionally enable memory clear, initiate reset via
CTRL2, wait for completion, and re-enable caching.
cxl_do_reset() orchestrates the full reset flow:
1. CXL pre-reset: mem offlining and cache flush (when memdev present)
2. PCI save/disable: pci_dev_save_and_disable() automatically saves
CXL DVSEC and HDM decoder state via PCI core hooks
3. Sibling coordination: save/disable CXL.cachemem sibling functions
4. Execute CXL DVSEC reset
5. Sibling restore: always runs to re-enable sibling functions
6. PCI restore: pci_dev_restore() automatically restores CXL state
The CXL-specific DVSEC and HDM save/restore is handled
by the PCI core's CXL save/restore infrastructure (drivers/pci/cxl.c).
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
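The ordering of the six steps, and in particular the point that sibling restore always runs, can be sketched as a small state-free driver. This is an illustrative model, not the body of cxl_do_reset(): the step enum, the callback struct, and the choice to skip the DVSEC reset when sibling save fails are assumptions.

```c
#include <stddef.h>

enum ex_step {
    EX_PRE_RESET, EX_PCI_SAVE, EX_SIBLING_SAVE,
    EX_DVSEC_RESET, EX_SIBLING_RESTORE, EX_PCI_RESTORE
};

struct ex_reset_ops {
    int (*run)(enum ex_step step, void *ctx); /* 0 on success */
    void *ctx;
};

/* Run the flow; once state has been saved, the restore steps always run. */
static int ex_do_reset_flow(const struct ex_reset_ops *ops)
{
    int rc = ops->run(EX_PRE_RESET, ops->ctx);

    if (rc)
        return rc;
    rc = ops->run(EX_PCI_SAVE, ops->ctx);
    if (rc)
        return rc;

    rc = ops->run(EX_SIBLING_SAVE, ops->ctx);
    if (!rc)
        rc = ops->run(EX_DVSEC_RESET, ops->ctx);

    /* Unconditional unwinding, in reverse order of the save steps. */
    ops->run(EX_SIBLING_RESTORE, ops->ctx);
    ops->run(EX_PCI_RESTORE, ops->ctx);
    return rc;
}

/* Recorder used to observe the order of steps; fail_at < 0 means no failure. */
struct ex_trace { enum ex_step seen[8]; size_t n; int fail_at; };

static int ex_record_step(enum ex_step step, void *ctx)
{
    struct ex_trace *t = ctx;

    if (t->n < 8)
        t->seen[t->n++] = step;
    return ((int)step == t->fail_at) ? -1 : 0;
}
```

Passing `ex_record_step` with `fail_at` set to `EX_DVSEC_RESET` shows both restore steps still being recorded after the failed reset.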
Add a "cxl_reset" sysfs attribute to PCI devices that support CXL
Reset (CXL r3.2 section 8.1.3.1). The attribute is visible only on
devices with both CXL.cache and CXL.mem capabilities and the CXL
Reset Capable bit set in the DVSEC.
Writing "1" to the attribute triggers the full CXL reset flow via
cxl_do_reset(). The interface is decoupled from memdev creation:
when a CXL memdev exists, memory offlining and cache flush are
performed; otherwise reset proceeds without the memory management.
The sysfs attribute is managed entirely by the CXL module using
sysfs_create_group() / sysfs_remove_group() rather than the PCI
core's static attribute groups. This avoids cross-module symbol
dependencies between the PCI core (always built-in) and CXL_BUS
(potentially modular).
At module init, existing PCI devices are scanned and a PCI bus
notifier handles hot-plug/unplug. kernfs_drain() makes sure that
any in-flight store() completes before sysfs_remove_group() returns,
preventing use-after-free during module unload.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
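The visibility rule stated in this commit message (CXL.cache and CXL.mem capable, plus the Reset Capable bit) reduces to a simple predicate on the DVSEC capability register. The sketch below is a userspace illustration only; the `EX_*` bit positions and the function name are assumptions, and the real attribute is gated by the CXL module rather than by this helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed capability bit positions, for illustration only. */
#define EX_DVSEC_CACHE_CAPABLE (1u << 0)
#define EX_DVSEC_MEM_CAPABLE   (1u << 2)
#define EX_DVSEC_RST_CAPABLE   (1u << 7)

/* True when all three capability bits required for cxl_reset are set. */
static bool ex_cxl_reset_visible(uint16_t dvsec_cap)
{
    const uint16_t required = EX_DVSEC_CACHE_CAPABLE |
                              EX_DVSEC_MEM_CAPABLE |
                              EX_DVSEC_RST_CAPABLE;

    return (dvsec_cap & required) == required;
}
```

Checking against a combined mask keeps the three conditions in one place, so a device that is mem-capable and reset-capable but lacks CXL.cache is rejected.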
…tribute
Document the cxl_reset sysfs attribute added to PCI devices that
support CXL Reset.
Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>
(backported from https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/)
Signed-off-by: Jiandi An <jan@nvidia.com>
…and RAS support
Add Ubuntu kernel config annotations for CXL-related configs introduced
or changed by the following cherry-picked patch series:
- drivers/cxl changes between v6.17.9 and upstream 7.0 (which includes
a portion of Terry Bowman's v14 CXL RAS series merged via
for-7.0/cxl-aer-prep)
- Alejandro Lucero's v23 CXL Type-2 device support series
- Smita Koralahalli's v6 patch 3/9 (cxl/region: Skip decoder reset on
detach for autodiscovered regions)
CONFIG_CXL_BUS: Enable CXL bus support built-in; required for
CXL Type-2 device and RAS support
CONFIG_CXL_PCI: Enable CXL PCI management built-in; auto-selects
CXL_MEM; required for CXL Type-2 device support
CONFIG_CXL_MEM: Auto-selected by CXL_PCI; required for CXL
memory expansion and Type-2 device support
CONFIG_CXL_PORT: Required for CXL port enumeration; defaults to
CXL_BUS value
CONFIG_FWCTL: Selected by CXL_BUS when CXL_FEATURES is enabled;
required for CXL feature mailbox access
CONFIG_CXL_RAS: New def_bool replacing PCIEAER_CXL (Terry Bowman
v14); auto-enabled with ACPI_APEI_GHES+PCIEAER+
CXL_BUS for CXL RAS error handling
CONFIG_SFC_CXL: Solarflare SFC9100-family CXL Type-2 device
support; not needed for NVIDIA platforms (n)
CONFIG_ACPI_APEI_EINJ: Required prerequisite for CONFIG_ACPI_APEI_EINJ_CXL
CONFIG_ACPI_APEI_EINJ_CXL: CXL protocol error injection support via APEI EINJ
CONFIG_PCIEAER_CXL: Remove it from debian.master policy. This config
was removed from Kconfig by upstream commit d18f1b7
(PCI/AER: Replace PCIEAER_CXL symbol with CXL_RAS) which is included
in this port.
CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION: Override debian.master
amd64-only policy to include arm64. Commit 4d873c5 added
'select ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION' to arch/arm64/Kconfig,
making this y on arm64 as well.
CONFIG_GENERIC_CPU_CACHE_MAINTENANCE: New bool config defined by
c460697 in lib/Kconfig. Selected by arm64 via 4d873c5;
not selected by x86. Set arm64: y, amd64: -.
CONFIG_CACHEMAINT_FOR_HOTPLUG: New optional menuconfig defined by
2ec3b54 in drivers/cache/Kconfig. Depends on
GENERIC_CPU_CACHE_MAINTENANCE so becomes visible on arm64. Defaults
to n; HiSilicon HHA driver not needed for NVIDIA platforms.
Set arm64: n, amd64: -.
Signed-off-by: Jiandi An <jan@nvidia.com>
…memory access
Override debian.master policy (m->y) for DEV_DAX, DEV_DAX_CXL, and
DEV_DAX_KMEM to ensure CXL memory regions are accessible as both raw
DAX devices and hotplugged System-RAM nodes.
debian.master sets these to 'm' (modules). For NVIDIA platforms with
CXL Type-2 devices, built-in (y) is required to ensure CXL memory
regions provisioned early in boot are immediately accessible without
relying on module loading order.
CONFIG_DEV_DAX: Override m->y; prerequisite for DEV_DAX_CXL and
DEV_DAX_KMEM to be built-in; depends on
TRANSPARENT_HUGEPAGE (already y in debian.master)
CONFIG_DEV_DAX_CXL: Override m->y; creates /dev/daxX.Y devices for CXL
RAM regions not in the default system memory map
(Soft Reserved or dynamically provisioned regions);
depends on CXL_BUS+CXL_REGION+DEV_DAX (all y)
CONFIG_DEV_DAX_KMEM: Override m->y; onlines CXL DAX devices as System-RAM
NUMA nodes via memory hotplug, making CXL memory
available for normal kernel and userspace allocation
Signed-off-by: Jiandi An <jan@nvidia.com>
…/restore
Add Ubuntu kernel config annotation for CONFIG_PCI_CXL introduced by
the CXL DVSEC and HDM state save/restore series (Srirangan Madhavan).
CONFIG_PCI_CXL: Hidden bool in drivers/pci/Kconfig; auto-enabled when
CXL_BUS=y. Gates compilation of drivers/pci/cxl.o which
saves and restores CXL DVSEC control/range registers and
HDM decoder state across PCI resets and link transitions.
Signed-off-by: Jiandi An <jan@nvidia.com>
2d99890 to
5c70002
Compare
Fixed. |
Thanks, verified with range-diff.
Just re-adding my ACK from earlier:
Merged and present in https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-nvidia/+git/noble/log/?h=nvidia-6.17-next Closing PR. |
|
The patchscan did post a comment — scroll up to find the The The job failed because of a different issue: commit This is a false positive — the SAUCE config commit isn't a backport of that upstream commit; it just references the upstream SHA for context. No fixes are missing from this PR. This also exposed a UX bug in the workflow: |
Previously, both E: (upstream commit-message mismatch) and W: (missing Fixes: patch) set fixes_found=true, causing the "Missing Fixes Detected" comment to appear even when no Fixes: patches were missing. PR NVIDIA#342 triggered exactly this: a SAUCE config commit referencing an upstream SHA with a different title caused E: output, but All fixes: was empty.

Replace the two separate if-blocks (which could overwrite each other via GITHUB_OUTPUT) with a single mutually-exclusive chain:
W: / "Fixes for" → fixes_found=true (missing Fixes: patches)
E: / non-zero rc → fixes_found=error (upstream verification failure)
neither → fixes_found=false (all-clear)

Update the "error" PR comment title and body to explain this is typically a false positive from SAUCE commits that reference upstream SHAs in their message body with a different title.

Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
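The three-way chain in this commit message can be expressed as a single classification function. The workflow itself is presumably shell inside a GitHub Actions step; the C version below is only a model of the decision order, and the line-prefix matching rule, function names, and string return values are assumptions.

```c
#include <stdbool.h>
#include <string.h>

/* True if any line of text begins with prefix. */
static bool ex_has_line_prefix(const char *text, const char *prefix)
{
    size_t n = strlen(prefix);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, prefix, n) == 0)
            return true;
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return false;
}

/*
 * Map scanner output and exit code to one fixes_found value. The branches
 * are tested in order, so exactly one value is ever chosen.
 */
static const char *ex_classify_patchscan(const char *output, int rc)
{
    if (ex_has_line_prefix(output, "W:") || strstr(output, "Fixes for"))
        return "true";  /* missing Fixes: patches */
    if (ex_has_line_prefix(output, "E:") || rc != 0)
        return "error"; /* upstream verification failure */
    return "false";     /* all clear */
}
```

Because the W: branch is tested first, output containing both W: and E: lines still reports missing fixes, while E: output alone no longer does.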
Description
This patch series adds comprehensive CXL (Compute Express Link) support to the
nvidia-6.17 kernel, including:
SmartNICs) to use CXL for coherent memory access via firmware-provisioned
regions
Implements PCIe Port Protocol error handling and logging for CXL Root Ports,
Downstream Switch Ports, and Upstream Switch Ports
registers and HDM decoder programming across PCI resets and link transitions,
enabling device re-initialization after reset for firmware-provisioned
configurations
Sections 8.1.3, 9.6, 9.7) via a sysfs interface for Type-2 devices,
including memory offlining, cache flushing, multi-function sibling
coordination, and DVSEC reset sequencing
interleaving where lower levels use smaller granularities than parent ports
(reverse HPA bit ordering)
upstream
torvalds/master covering the range from v6.17.9 to the merge point of Terry Bowman's v14 series into v7.0
mapping CXL DAX devices as System-RAM
Key Features Added:
(replacing the old PCIEAER_CXL symbol with the new CXL_RAS def_bool)
/sys/bus/pci/devices/.../cxl_reset) for Type-2 devices with Reset Capable bit set
Function Map DVSEC
cpu_cache_invalidate_memregion() during reset
levels (firmware-provisioned configurations)
DEV_DAX_CXL) and System-RAM mapping (DEV_DAX_KMEM)
ACPI_APEI_EINJ_CXL)
Justification
CXL Type-2 device support is critical for next-generation NVIDIA accelerators
and data center workloads:
Source
Patch Breakdown (153 patches + 1 revert):
torvalds/master (v6.17.9 → merge of Terry Bowman v14 into v7.0)
torvalds/master (merged fixes + 1 prerequisite)
Notes on the upstream cherry-picks (item 2):
The 103 upstream commits span 1bfd0faa78d0 (v6.17.9) to 0da3050bdded
(Merge of for-7.0/cxl-aer-prep into cxl-for-next).
This range includes 17 out of 34 patches from Terry Bowman's v14 series
that were reworked by the CXL maintainer and merged into v7.0 via the
for-7.0/cxl-aer-prep branch. The remaining 17 patches from Terry's v14
were refactored into v15 (9 patches, not yet merged) and are not included
in this port.
Notes on the save/restore and reset series (items 6–7):
Srirangan's patches were authored against upstream v7.0-rc1 (which does not
include Alejandro's v23 Type-2 series). For this port, the header
reorganization in patch 2/5 of the save/restore series was adapted to align
with Alejandro's v23 approach: HDM decoder and register map definitions were
moved to include/cxl/cxl.h (not include/cxl/pci.h as in the original
patch) to follow the convention established by Alejandro's series. Upstream
reviewers have indicated that Srirangan's series should be rebased on top of
Alejandro's once it merges.
Notes on the upstream fixes (item 8):
14 upstream commits cherry-picked from torvalds/master to fix bugs
in the ported commits from items 2 and 6–7. These include 13 fixes
(identified via Fixes: tags in upstream) plus 1 prerequisite helper
function (port_to_host()) required by one of the fixes:
Upstream commit    Fixes
822655e6751d       prerequisite for 0066688dbcdc
88c72bab77aa       d6602e25819d (extended linear cache)
8fdc61faa730       4d1608d0ab33 (cache Kconfig)
521cadb4b69e       4d1608d0ab33 (cache Kconfig)
8441c7d3bd6c       b78b9e7b7979 + c3dd67681c70
3e8aaacdad4f       4f06d81e7c6a (defer dport)
49d106347913       4fe516d2ad1a (XOR calculations)
77b310bb7b5f       d6602e25819d (extended linear cache)
0066688dbcdc       4f06d81e7c6a (defer dport)
318c58852e68       29317f8dc6ed (cxl_memdev_attach)
0a70b7cd397e       2230c4bdc412 (locked decoder)
9a6a2091324a       29317f8dc6ed (cxl_memdev_attach)
3bfc213d4675       4aac11c9a6e7 (microchip mfd)
27459f86a437       4aac11c9a6e7 (microchip mfd)
Lore Links:
Terry Bowman's CXL RAS series (v14, partially merged into v7.0):
https://lore.kernel.org/all/20260114182055.46029-1-terry.bowman@amd.com/
Smita Koralahalli's CXL EINJ series (v6, patch 3/9 only):
https://lore.kernel.org/linux-cxl/20260210064501.157591-1-Smita.KoralahalliChannabasappa@amd.com/
Alejandro Lucero's CXL Type-2 series (v23):
https://lore.kernel.org/linux-cxl/20260201155438.2664640-1-alejandro.lucero-palau@amd.com/
Robert Richter's multi-level interleaving fix (v1):
https://lore.kernel.org/all/20251028094754.72816-1-rrichter@amd.com/
Srirangan Madhavan's CXL state save/restore series:
https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/
Srirangan Madhavan's CXL reset series (v5):
https://lore.kernel.org/linux-cxl/20260306092322.148765-1-smadhavan@nvidia.com/
Upstream Status:
torvalds/master (v7.0 range)
torvalds/master
for-7.0/cxl-aer-prep
Testing
Build Validation:
Config Verification:
CXL-related configs enabled as expected:
Runtime Testing:
ls /sys/bus/cxl/devices/)
echo 1 > /sys/bus/pci/devices/<dev>/cxl_reset)
Notes
CONFIG_PCIEAER_CXL has been removed from Kconfig by upstream commit
d18f1b7beadf (PCI/AER: Replace PCIEAER_CXL symbol with CXL_RAS).
The debian.master annotation for PCIEAER_CXL=y is overridden to - in
debian.nvidia-6.17/config/annotations.
CONFIG_CXL_BUS, CONFIG_CXL_PCI, CONFIG_CXL_MEM, CONFIG_CXL_PORT
remain tristate (not bool); the v14 series kept them as tristate,
unlike earlier draft versions.
CONFIG_DEV_DAX, CONFIG_DEV_DAX_CXL, and CONFIG_DEV_DAX_KMEM are
overridden from m (debian.master default) to y to support built-in
CXL RAM region DAX access and System-RAM mapping.
CONFIG_PCI_CXL is a new hidden bool introduced by the save/restore
series; auto-enabled when CXL_BUS=y. Gates compilation of
drivers/pci/cxl.o for DVSEC and HDM state save/restore.
CONFIG_GENERIC_CPU_CACHE_MAINTENANCE and
CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION are new configs
introduced by the upstream cherry-picks; arm64 auto-selects both.
cpu_cache_invalidate_memregion() is also used by the CXL reset
series for cache flushing during reset.
debian.nvidia-6.17/config/annotations to reflect all of the above changes.
align with Alejandro's v23 approach (include/cxl/cxl.h instead of
include/cxl/pci.h). See commit message on patch 2/5 for details.