[linux-nvidia-6.18-next] PCI: mirror PI7C9X3G606GPC Port 4 BAR0 #447
Open
nirmoy wants to merge 187 commits into
Introduce a generic macro, TEGRA_GPIO_PORT, to define SoC-specific port macros. This simplifies the code and avoids unnecessary duplication.
Suggested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Kartik Rajput <kkartik@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
(cherry picked from commit f75db6f)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Extend the existing Tegra186 GPIO controller driver with support for the GPIO controller found on Tegra410. Tegra410 supports two GPIO controllers referred to as 'COMPUTE' and 'SYSTEM'. Co-developed-by: Nathan Hartman <nhartman@nvidia.com> Signed-off-by: Nathan Hartman <nhartman@nvidia.com> Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> (cherry picked from commit 9631a10) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
On Tegra410, the Compute and System GPIO controllers use the same port names. As a result, `tegra186_gpio_probe()` generates identical GPIO names for both controllers during initialization, which triggers the following warnings:
kernel: gpio gpiochip1: Detected name collision for GPIO name 'PA.00'
kernel: gpio gpiochip1: Detected name collision for GPIO name 'PA.01'
kernel: gpio gpiochip1: Detected name collision for GPIO name 'PA.02'
kernel: gpio gpiochip1: Detected name collision for GPIO name 'PB.00'
kernel: gpio gpiochip1: Detected name collision for GPIO name 'PB.01'
...
Add a GPIO name prefix to the SoC data and use it when initializing the GPIO names. Port names remain unchanged for previous SoCs. On Tegra410, Compute GPIOs are named COMPUTE-P<PORT>.GPIO and System GPIOs are named SYSTEM-P<PORT>.GPIO.
Fixes: 9631a10 ("gpio: tegra186: Add support for Tegra410")
Signed-off-by: Kartik Rajput <kkartik@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://lore.kernel.org/r/20251113163112.885900-1-kkartik@nvidia.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
(cherry picked from commit 67f9b82)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
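The naming scheme described in this commit can be illustrated with a small user-space sketch. It is not the driver's code: `format_gpio_name()` and its parameters are hypothetical, and the two-digit pin format is only inferred from the 'PA.00' names in the warnings above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the tegra186 driver): build a GPIO name of
 * the form "<prefix>P<port>.<pin>". A NULL prefix models older SoCs, whose
 * names stay unchanged.
 */
int format_gpio_name(char *buf, size_t len, const char *prefix,
                     const char *port, unsigned int pin)
{
    return snprintf(buf, len, "%sP%s.%02u",
                    prefix ? prefix : "", port, pin);
}
```

With a per-controller prefix such as "COMPUTE-" or "SYSTEM-", port A pin 0 yields two distinct names instead of two copies of 'PA.00'.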
Tegra410 and Tegra241 have deprecated the HIDREV register, and the recommended replacement is an ARM SMCCC call. Use ARM SMCCC to get the chip_id and the major and minor revisions.
Signed-off-by: Kartik Rajput <kkartik@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Extract common cleanup code into dedicated helper functions to simplify the code and improve readability. This refactoring includes:
- tegra_qspi_reset(): Device reset and interrupt cleanup
- tegra_qspi_dma_stop(): DMA termination and disable
- tegra_qspi_pio_stop(): PIO mode disable
No functional changes. This is purely a code reorganization to prepare for improved timeout handling in subsequent patches.
Signed-off-by: Vishwaroop A <va@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20251028155703.4151791-3-va@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 6022eac)
Signed-off-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Under high system load, QSPI interrupts can be delayed or blocked on the target CPU, causing wait_for_completion_timeout() to report failure even though the hardware successfully completed the transfer.
When a timeout occurs, check the QSPI_RDY bit in QSPI_TRANS_STATUS to determine if the hardware actually completed the transfer. If so, manually invoke the completion handler to process the transfer successfully instead of failing it.
This distinguishes lost/delayed interrupts from real hardware timeouts, preventing unnecessary failures of transfers that completed successfully.
Signed-off-by: Vishwaroop A <va@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20251028155703.4151791-4-va@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 380fd29)
Signed-off-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
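The timeout decision described in this commit can be sketched as a pure function. This is a simplified model, not the driver's code: `classify_wait()`, the `xfer_result` enum, and the bit position chosen for `SKETCH_QSPI_RDY` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the ready flag; illustration only. */
#define SKETCH_QSPI_RDY (UINT32_C(1) << 30)

enum xfer_result {
    XFER_DONE,          /* interrupt arrived before the timeout */
    XFER_DONE_LATE_IRQ, /* timeout fired, but hardware had finished */
    XFER_HW_TIMEOUT     /* timeout fired and hardware is not ready */
};

/*
 * completion_signalled models a successful wait_for_completion_timeout();
 * trans_status models a read of the transfer status register taken after
 * the wait returned.
 */
enum xfer_result classify_wait(bool completion_signalled,
                               uint32_t trans_status)
{
    if (completion_signalled)
        return XFER_DONE;
    if (trans_status & SKETCH_QSPI_RDY)
        return XFER_DONE_LATE_IRQ; /* run the completion handler manually */
    return XFER_HW_TIMEOUT;        /* fail the transfer and reset */
}
```

Only the last case is treated as a real failure; the middle case corresponds to the lost or delayed interrupt that the commit recovers from.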
When APEI fails to handle a stage-2 synchronous external abort (SEA),
KVM currently injects an asynchronous SError into the vCPU and then
resumes it, which usually results in a guest kernel panic.
One major source of guest SEAs is a vCPU consuming a recoverable
uncorrected memory error (UER). Although the SError and the resulting
guest kernel panic effectively stop the propagation of corrupted memory,
the guest may reuse the corrupted memory if it is auto-rebooted; in the
worst case, the guest boot itself may run into poisoned memory. So there
is room to recover from a UER in a more graceful manner.
Alternatively, KVM can redirect the synchronous SEA event to the VMM to:
- Reduce the blast radius where possible. The VMM can inject a SEA into
the vCPU via KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
consumption or fault does not come from the guest kernel, the blast
radius can be limited to the triggering thread in guest userspace, so
the VM can keep running.
- Allow the VMM to protect against future memory poison consumption by
unmapping the page from stage-2, or to notify the guest of the
poisoned page so the guest kernel can unmap it from its stage-1 page table.
- Allow the VMM to track the SEA events that VM customers care about, to
restart the VM when a certain number of distinct poison events have
happened, and to provide observability to customers in a log management UI.
Introduce a userspace-visible feature that lets the VMM handle SEA:
- KVM_CAP_ARM_SEA_TO_USER. As an alternative fallback behavior for
when host APEI fails to claim a SEA, userspace can opt in to this new
capability to let KVM exit to userspace during a SEA that is not
owned by the host.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this.
KVM fills kvm_run.arm_sea with as much information as possible about
the SEA, enabling the VMM to emulate the SEA to the guest by itself:
- Sanitized ESR_EL2. The general rule is to keep only the bits
useful for userspace and relevant to guest memory.
- Flags indicating whether the faulting guest physical address is valid.
- Faulting guest physical and virtual addresses, if valid.
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Co-developed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://msgid.link/20251013185903.1372553-2-jiaqiyan@google.com
Signed-off-by: Oliver Upton <oupton@kernel.org>
(cherry picked from commit ad9c62b)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Test how KVM handles a guest SEA when APEI is unable to claim it and KVM_CAP_ARM_SEA_TO_USER is enabled. The behavior is triggered by consuming a recoverable memory error (UER) injected via EINJ.
The test asserts two major things:
1. KVM returns to userspace with the KVM_EXIT_ARM_SEA exit reason and has provided the expected fault information, e.g. esr, flags, gva, gpa.
2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting a SEA into the guest, and KVM injects the expected SEA into the vCPU.
Tested on a data center server running a Siryn AmpereOne processor that has RAS support.
Several things to note before attempting to run this selftest:
- The test relies on EINJ support in both firmware and kernel to inject the UER. Otherwise the test will be skipped.
- The APEI of the platform under test should be unable to claim the SEA. Otherwise the test will be skipped.
- Some platforms don't support notrigger in EINJ, which may cause APEI and GHES to offline the memory before the guest can consume the injected UER, making the test unable to trigger the SEA.
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Link: https://msgid.link/20251013185903.1372553-3-jiaqiyan@google.com
Signed-off-by: Oliver Upton <oupton@kernel.org>
(cherry picked from commit feee9ef)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Document the new userspace-visible features and APIs for handling synchronous external abort (SEA):
- KVM_CAP_ARM_SEA_TO_USER: How userspace enables the new feature.
- KVM_EXIT_ARM_SEA: The exit userspace gets when it needs to handle a SEA, and what userspace gets while taking the SEA.
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Link: https://msgid.link/20251013185903.1372553-4-jiaqiyan@google.com
[ oliver: make documentation concise, remove implementation detail ]
Signed-off-by: Oliver Upton <oupton@kernel.org>
(cherry picked from commit 4debb5e)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
The main use of {LD,ST}64B* is to talk to a device, which is hopefully
directly assigned to the guest and requires no additional handling.
However, this does not preclude a VMM from exposing a virtual device
to the guest, and to allow 64 byte accesses as part of the programming
interface. A direct consequence of this is that we need to be able
to forward such access to userspace.
Given that such a contraption is very unlikely to ever exist, we choose
to offer a limited service: userspace gets (as part of a new exit reason)
the ESR, the IPA, and that's it. It is fully expected to handle the full
semantics of the instructions, deal with ACCDATA, the return values and
increment PC. Much fun.
A canonical implementation can also simply inject an abort and be done
with it. Frankly, don't try to do anything else unless you have time
to waste.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit f174a9f linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add a bit of documentation for KVM_EXIT_ARM_LDST64B so that userspace knows what to expect. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 902eeba linux-next) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Acked-by: Carol L. Soto <csoto@nvidia.com> Acked-by: Nirmoy Das <nirmoyd@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…emory
If FEAT_LS64WB is not supported, FEAT_LS64* instructions can only access Device/Uncacheable memory; otherwise a data abort for unsupported Exclusive or atomic access (0x35, UAoEF) is generated, per the spec. Which exception level the abort is routed to is implementation defined, and it may be routed to EL2 on a VHE VM according to DDI0487L.b Section C3.2.6 "Single-copy atomic 64-byte load/store".
If the implementation generates the DABT at the final enabled stage (stage-2), inject the UAoEF back into the guest after checking that the memslot is valid.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 2937aee linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Instructions introduced by FEAT_{LS64, LS64_V} are controlled by
HCRX_EL2.{EnALS, EnASR}. Configure all of these to allow usage
at EL0/1.
This doesn't mean these instructions are always available at
EL0/1 if provided. The hypervisor still has control at
runtime.
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit dea58da linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Using FEAT_{LS64, LS64_V} instructions in a guest is also controlled
by HCRX_EL2.{EnALS, EnASR}. Enable them if the guest has the related feature.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 151b92c linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Armv8.7 introduces single-copy atomic 64-byte load and store
instructions and their variants, named under FEAT_{LS64, LS64_V}.
These features are identified by ID_AA64ISAR1_EL1.LS64, and the
use of such instructions in userspace (EL0) can be trapped.
Because st64bv (FEAT_LS64_V) and st64bv0 (FEAT_LS64_ACCDATA) cannot be
told apart, FEAT_LS64 and FEAT_LS64_ACCDATA, which will be supported in a
later patch, will be exported to userspace, while FEAT_LS64_V will be
enabled only in the kernel.
In order to support the use of the corresponding instructions in userspace:
- Make ID_AA64ISAR1_EL1.LS64 visible to userspace
- Add identification and enabling to the cpufeature list
- Expose support for these features to userspace through HWCAP3
and cpuinfo
ld64b/st64b (FEAT_LS64) and st64bv (FEAT_LS64_V) are intended for
special memory (device memory), so they require support from the CPU, the
system, and the target memory location (a device that supports these
instructions).
HWCAP3_LS64 implies support by the CPU and the system (since there is no
method to identify system support, SoC vendors should advertise
support in the CPU only if the system also supports them).
Otherwise, for ld64b/st64b, atomicity may not be guaranteed or a
DABT will be generated, so users (probably userspace driver developers)
should make sure the target memory (device) also has the support.
For st64bv, 0xffffffffffffffff will be returned as the status result for
unsupported memory, so users should check it.
Document the restrictions along with HWCAP3_LS64.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit 58ce786 linux-next)
[mochs: Minor context cleanup due to lack of "arm64: Detect FEAT_XNX"]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add tests for FEAT_LS64. Issue the related instructions if the feature is present; no SIGILL should be received. When such instructions operate on memory that is not Device or non-cacheable memory, we may receive a SIGBUS during the test (without FEAT_LS64WB). Just ignore it, since we only test whether the instruction itself can be issued as expected on platforms declaring support for such features.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 57a9635 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…evice-nGnRE
Add CONFIG_ARM64_WORKAROUND_NC_TO_NGNRE configuration option that
enables conversion of MT_NORMAL_NC (Normal Non-Cacheable) memory
attribute to Device-nGnRE memory type in MAIR_EL1 for hardware that
requires stricter memory ordering or has issues with Non-Cacheable
memory mappings.
Key changes:
1. New memory type MT_NORMAL_NC_DMA (Attr5):
- Introduced specifically for DMA coherent memory mappings
- Configured with the same Normal Non-Cacheable attribute (0x44)
as MT_NORMAL_NC (Attr2) by default
- pgprot_dmacoherent uses MT_NORMAL_NC_DMA when workaround is
enabled, MT_NORMAL_NC otherwise
2. MAIR_EL1 conversion via alternatives framework:
- arch/arm64/mm/proc.S uses ARM64 alternatives to patch MAIR_EL1
during early boot
- Converts MT_NORMAL_NC (Attr2) from 0x44 to 0x04 (Device-nGnRE)
using efficient bfi instruction
- MT_NORMAL_NC_DMA (Attr5) keeps the same attribute value as
MT_NORMAL_NC originally had
- Zero performance overhead when workaround is disabled
3. Boot-time configuration:
- Enabled via kernel command line: mair_el1_nc_to_ngnre=1
- Boot CPU fixup in enable_nc_to_ngnre() applies conversion before
alternatives are patched
- Secondary CPUs automatically use patched alternatives in
__cpu_setup
- Runtime changes not supported as alternatives cannot be
re-patched after boot
4. Errata framework integration:
- Registered in arm64_errata[] array as ARM64_WORKAROUND_NC_TO_NGNRE
- Capability type: ARM64_CPUCAP_BOOT_CPU_FEATURE
- Uses cpucap_is_possible() for build-time capability checking
The workaround preserves pgprot_dmacoherent behavior while allowing
MT_NORMAL_NC to be converted to Device memory type for other mappings
that may be affected by hardware issues. Ensure NC memory attribute
assignment is prevented for passthrough device MMIO regions.
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Acked-by: Carol L. Soto <csoto@nvidia.com>
Acked-by: Nirmoy Das <nirmoyd@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
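The attribute rewrite in this commit is plain bitfield arithmetic on MAIR_EL1, which holds eight 8-bit attribute fields, with AttrN at bits [8N+7:8N]. The sketch below models it in user space: the attribute values and indices come from the commit message, while `mair_set_attr()` and `mair_get_attr()` are hypothetical helpers that stand in for what a bfi instruction does to the register.

```c
#include <assert.h>
#include <stdint.h>

#define ATTR_NORMAL_NC    0x44u /* Normal Non-Cacheable */
#define ATTR_DEVICE_nGnRE 0x04u /* Device-nGnRE */
#define IDX_NORMAL_NC     2     /* MT_NORMAL_NC (Attr2) */
#define IDX_NORMAL_NC_DMA 5     /* MT_NORMAL_NC_DMA (Attr5) */

/* Replace the 8-bit attribute field at index idx, as a bitfield insert would. */
uint64_t mair_set_attr(uint64_t mair, unsigned int idx, uint8_t attr)
{
    unsigned int shift = idx * 8;

    mair &= ~(UINT64_C(0xff) << shift);
    return mair | ((uint64_t)attr << shift);
}

/* Read back the 8-bit attribute field at index idx. */
uint8_t mair_get_attr(uint64_t mair, unsigned int idx)
{
    return (uint8_t)(mair >> (idx * 8));
}
```

Starting from a value where Attr2 and Attr5 both hold 0x44, rewriting only Attr2 to 0x04 leaves Attr5 untouched, which is how DMA coherent mappings keep Normal Non-Cacheable semantics while other MT_NORMAL_NC users become Device-nGnRE.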
CPU_CYCLES is expected to count the logical CPU (PE) clock. Currently it's
preferred to use PMCCNTR_EL0 for counting CPU_CYCLES, but it'll count
processor clock rather than the PE clock (ARM DDI0487 L.b D13.1.3) if
one of the SMT siblings is not idle on a multi-threaded implementation.
So don't use it on SMT cores.
Introduce topology_core_has_smt() to detect an SMT implementation and
cache the result in arm_pmu::has_smt during allocation.
When counting cycles on SMT CPU 2-3 and CPU 3 is idle, without this
patch we'll get:
[root@client1 tmp]# perf stat -e cycles -A -C 2-3 -- stress-ng -c 1
--taskset 2 --timeout 1
[...]
Performance counter stats for 'CPU(s) 2-3':
CPU2 2880457316 cycles
CPU3 2880459810 cycles
1.254688470 seconds time elapsed
With this patch the idle state of CPU3 is observed as expected:
[root@client1 ~]# perf stat -e cycles -A -C 2-3 -- stress-ng -c 1
--taskset 2 --timeout 1
[...]
Performance counter stats for 'CPU(s) 2-3':
CPU2 2558580492 cycles
CPU3 305749 cycles
1.113626410 seconds time elapsed
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit c3d78c3)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
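The counter selection this commit changes can be modeled as a small decision function. This is a reduced sketch under stated assumptions: `struct pmu_sketch`, `pick_cycles_counter()` and the `cycle_counter_free` parameter are hypothetical, and the real driver checks further conditions that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cached arm_pmu::has_smt flag. */
struct pmu_sketch {
    bool has_smt;
};

enum counter_kind {
    COUNTER_CYCLE,        /* the dedicated cycle counter, PMCCNTR_EL0 */
    COUNTER_PROGRAMMABLE  /* a general event counter counting CPU_CYCLES */
};

/*
 * Choose where a CPU_CYCLES event should be counted. On SMT cores the
 * dedicated counter tracks the processor clock rather than the PE clock,
 * so it is avoided there.
 */
enum counter_kind pick_cycles_counter(const struct pmu_sketch *pmu,
                                      bool cycle_counter_free)
{
    if (!pmu->has_smt && cycle_counter_free)
        return COUNTER_CYCLE;
    return COUNTER_PROGRAMMABLE;
}
```

In the perf output above, this is the difference between CPU3 reporting nearly the same cycle count as its busy sibling and reporting its own, mostly idle, count.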
…ERIC_ARCH_TOPOLOGY
The arm_pmu driver uses topology_core_has_smt() to retrieve the SMT implementation, which depends on CONFIG_GENERIC_ARCH_TOPOLOGY. The config is optional on arm platforms, so provide a !CONFIG_GENERIC_ARCH_TOPOLOGY stub for topology_core_has_smt().
Fixes: c3d78c3 ("perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511041757.vuCGOmFc-lkp@intel.com/
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Yicong Yang <yangyccccc@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 7ab06ea)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add the part number and MIDR for NVIDIA Olympus. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> (cherry picked from commit d5e4c71) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add NVIDIA Olympus MIDR to neoverse_spe range list. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> (backported from commit d852b83) [mochs: Minor context cleanup due to absence of "perf arm_spe: Add CPU variants supporting common data source packet"] Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
An implementer may need to reset a filter config when stopping a counter, so add a callback for this.
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit a2573bc)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
The PMIIDR value is composed of the values in the PMPIDR registers. Use the PMPIDR registers as an alternative for device identification on systems that do not implement PMIIDR.
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 04330be)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Distinguish NVIDIA devices by revision and variant bits in PMIIDR register in addition to product id. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 82dfd72) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Support NVIDIA PMU that utilizes the optional event filter2 register. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit decc368) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
The documentation in nvidia-pmu.rst contains PMUs specific to NVIDIA Tegra241 SoC. Rename the file for this specific SoC to have better distinction with other NVIDIA SoC. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> (backported from https://lore.kernel.org/all/20260126181155.2776097-1-bwicaksono@nvidia.com/) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Adds Unified Coherence Fabric PMU support in Tegra410 SOC.
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
(backported from https://lore.kernel.org/all/20260126181155.2776097-1-bwicaksono@nvidia.com/)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Acked-by: Carol L Soto <csoto@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add interface to get ACPI device associated with the PMU. This ACPI device may contain additional properties not covered by the standard properties. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> (backported from https://lore.kernel.org/all/20260126181155.2776097-1-bwicaksono@nvidia.com/) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Adds PCIE PMU support in Tegra410 SOC. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> (backported from https://lore.kernel.org/all/20260126181155.2776097-1-bwicaksono@nvidia.com/) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Adds PCIE-TGT PMU support in Tegra410 SOC. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> (backported from https://lore.kernel.org/all/20260126181155.2776097-1-bwicaksono@nvidia.com/) Signed-off-by: Matthew R. Ochs <mochs@nvidia.com> Acked-by: Carol L Soto <csoto@nvidia.com> Acked-by: Jamie Nguyen <jamien@nvidia.com> Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…cy PMU" This reverts commit e0ab9dd. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…TGT PMU" This reverts commit 9361be0. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…PMU" This reverts commit 839af7d. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
This reverts commit bb2ae52. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
This reverts commit 4a814e6. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
…a241" This reverts commit f2a7f36. Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
The documentation in nvidia-pmu.rst contains PMUs specific to NVIDIA Tegra241 SoC. Rename the file for this specific SoC to have better distinction with other NVIDIA SoC. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit d332424) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
The Unified Coherence Fabric (UCF) contains last level cache and cache coherent interconnect in Tegra410 SOC. The PMU in this device can be used to capture events related to access to the last level cache and memory from different sources. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit f5caf26) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add interface to get ACPI device associated with the PMU. This ACPI device may contain additional properties not covered by the standard properties. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit bc86281) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Adds PCIE PMU support in Tegra410 SOC. This PMU is instanced in each root complex in the SOC and can capture traffic from PCIE device to various memory types. This PMU can filter traffic based on the originating root port or BDF and the target memory types (CPU DRAM, GPU Memory, CXL Memory, or remote Memory). Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit bf585ba) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Adds PCIE-TGT PMU support in Tegra410 SOC. This PMU is instanced in each root complex in the SOC and it captures traffic originating from any source towards PCIE BAR and CXL HDM range. The traffic can be filtered based on the destination root port or target address range. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 3dd7302) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Adds CPU Memory (CMEM) Latency PMU support in Tegra410 SOC. The PMU is used to measure latency between the edge of the Unified Coherence Fabric to the local system DRAM. Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Signed-off-by: Will Deacon <will@kernel.org> (cherry picked from commit 429b763) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Adds NVIDIA C2C PMU support in Tegra410 SOC. This PMU is used to measure memory latency between the SOC and device memory, e.g. GPU Memory (GMEM), CXL Memory, or memory on a remote Tegra410 SOC.
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 2f89b7f)
Signed-off-by: Lee Trager <ltrager@nvidia.com>
Acked-by: Seth Forshee <sforshee@nvidia.com>
Acked-by: Matthew R. Ochs <mochs@nvidia.com>
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Add JSON files for NVIDIA Tegra410 Olympus core PMU events. Also updated the common-and-microarch.json. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> (cherry picked from commit 86ff690) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
PMCCNTR_EL0 may continue to increment on NVIDIA Olympus CPUs while the PE is in WFI/WFE. That does not necessarily match the CPU_CYCLES event counted by a programmable counter, so using PMCCNTR_EL0 for cycles can give results that differ from the programmable counter path. Extend the existing PMCCNTR avoidance decision from the SMT case to also cover Olympus. Store the result in the common arm_pmu state at registration time, so arm_pmuv3 can keep using a single flag when deciding whether CPU_CYCLES may use PMCCNTR_EL0. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> (cherry picked from https://lore.kernel.org/all/20260504175204.3122979-1-bwicaksono@nvidia.com/) Signed-off-by: Lee Trager <ltrager@nvidia.com> Acked-by: Seth Forshee <sforshee@nvidia.com> Acked-by: Matthew R. Ochs <mochs@nvidia.com> Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
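The decision described in this commit can be sketched roughly as follows. This is a minimal userspace illustration, not the driver code: `struct fake_pmu`, `fake_pmu_register()` and `cycles_may_use_pmccntr()` are hypothetical names, and the real arm_pmuv3 check likely considers further conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the common PMU state. */
struct fake_pmu {
	bool has_smt;        /* cores share a PMU across SMT threads */
	bool is_olympus;     /* PMCCNTR_EL0 keeps counting in WFI/WFE */
	bool avoid_pmccntr;  /* single flag, computed once at registration */
};

/* Compute the avoidance flag once, when the PMU is registered. */
static inline void fake_pmu_register(struct fake_pmu *pmu)
{
	pmu->avoid_pmccntr = pmu->has_smt || pmu->is_olympus;
}

/* Later, the CPU_CYCLES path consults only the stored flag. */
static inline bool cycles_may_use_pmccntr(const struct fake_pmu *pmu)
{
	return !pmu->avoid_pmccntr;
}
```

With this shape, a PMU flagged as Olympus routes CPU_CYCLES to a programmable counter in the same way the SMT case does, and the event path never needs to re-derive the reason.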
Boro review
Summary
No issues found across the reviewed commits.
Findings: no problems found
Latest watcher review: open review
Kernel deb build: successful (download debs, 4 files)
Head:
This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.
55587b6 to 7f1b4f6
8654dc2 to 4e33789
Some Pericom/Diodes PI7C9X3G606GPC switches require downstream Port 4 BAR0 to mirror BAR0 of the immediate upstream port. Firmware may apply this during boot, but Linux PCI resource assignment can move the upstream BAR0 and leave Port 4 without the required mirror. Diodes confirmed that Tile0/P4 is OS-visible as device 04, function 0 on the bus below the upstream port. Add a final and early resume quirk for that downstream function. The quirk verifies that the immediate upstream bridge is the same switch, then writes Port 4 BAR0 from the upstream BAR0 after resource assignment and during early resume. Port 4 BAR0 may read back as zero even after a successful write, so the write must be validated by platform-specific means. Change-Id: I01079d67c4f665da5162180db929e2fe43d64ac2 Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
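The mirroring step described in this commit message can be sketched in userspace C against a simulated config space. Everything here is an assumption for illustration: `struct fake_pci_fn`, the `cfg_read32()`/`cfg_write32()` helpers and `mirror_port4_bar0()` are hypothetical, and matching on vendor/device ID stands in for whatever "same switch" check the actual quirk performs. The simulation also does not model the caveat that Port 4 BAR0 may read back as zero after a successful write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_BASE_ADDRESS_0 0x10 /* standard config-space offset of BAR0 */

/* Hypothetical stand-in for one PCI function and its config space. */
struct fake_pci_fn {
	uint16_t vendor;
	uint16_t device;
	uint8_t devnum;               /* device number on its bus */
	uint8_t func;                 /* function number */
	uint32_t cfg[16];             /* first 64 bytes of config space */
	struct fake_pci_fn *upstream; /* immediate upstream bridge, or NULL */
};

static inline uint32_t cfg_read32(const struct fake_pci_fn *fn, unsigned int off)
{
	assert(off < 64 && off % 4 == 0);
	return fn->cfg[off / 4];
}

static inline void cfg_write32(struct fake_pci_fn *fn, unsigned int off, uint32_t val)
{
	assert(off < 64 && off % 4 == 0);
	fn->cfg[off / 4] = val;
}

/* Returns true if the mirror write was issued for this function. */
static inline bool mirror_port4_bar0(struct fake_pci_fn *port)
{
	struct fake_pci_fn *up = port->upstream;

	/* Only device 04, function 0 below the upstream port is affected. */
	if (port->devnum != 4 || port->func != 0)
		return false;

	/* Assumed check: the upstream bridge belongs to the same switch. */
	if (up == NULL || up->vendor != port->vendor || up->device != port->device)
		return false;

	cfg_write32(port, PCI_BASE_ADDRESS_0, cfg_read32(up, PCI_BASE_ADDRESS_0));
	return true;
}
```

In the kernel, the equivalent logic would run once after resource assignment and again during early resume, since either event can leave the two BAR0 values out of step.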
4e33789 to b8a7745
arighi (Collaborator) approved these changes on Jun 1, 2026
LGTM, thanks!
Acked-by: Andrea Righi <arighi@nvidia.com>
a2af04d to 295622c
Collaborator
I have a similar question about 64-bit BARs as was raised on the PR for 6.17. It seems that BAR0 is expected to be 32-bit, but the 64-bit BAR check was removed. I'd like to see the resolution to this question before merging.
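For context on the 64-bit question, a memory BAR encodes its width in bits 2:1 of the low dword, and a 64-bit BAR carries its upper address bits in the following dword. The sketch below decodes that layout in plain C; the macro and function names are hypothetical (the values match the standard PCI BAR encoding), and it does not claim to reproduce the check that was removed from the quirk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BAR_SPACE_IO      0x1u /* bit 0: 1 = I/O space, 0 = memory space */
#define BAR_MEM_TYPE_MASK 0x6u /* bits 2:1: memory BAR type */
#define BAR_MEM_TYPE_64   0x4u /* 0b10 in bits 2:1: 64-bit memory BAR */
#define BAR_MEM_ADDR_MASK (~0xfu) /* low 4 bits are flags, not address */

/* True if the low dword describes a 64-bit memory BAR. */
static inline bool bar_is_mem64(uint32_t bar_lo)
{
	if (bar_lo & BAR_SPACE_IO)
		return false;
	return (bar_lo & BAR_MEM_TYPE_MASK) == BAR_MEM_TYPE_64;
}

/* Combine the low dword and, for 64-bit BARs only, the next dword. */
static inline uint64_t bar_mem_address(uint32_t bar_lo, uint32_t bar_hi)
{
	uint64_t addr = bar_lo & BAR_MEM_ADDR_MASK;

	if (bar_is_mem64(bar_lo))
		addr |= (uint64_t)bar_hi << 32;
	return addr;
}
```

If the upstream BAR0 were 64-bit, copying only its low dword into a 32-bit Port 4 BAR0 would drop the upper address bits, which is the kind of mismatch the question is asking about.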
Summary
Validation
- `b8a7745cab1c3c7be7b25c2dbba0f038662b60f6`.
- `scripts/checkpatch.pl --strict --ignore GERRIT_CHANGE_ID --git HEAD`
- `git diff --check HEAD~1..HEAD`
- `make O=/tmp/nv-kernels-pr447-quirks-build -j$(nproc) drivers/pci/quirks.o`
- `make bindeb-pkg` after removing the 64-bit upstream BAR0 skip.
- `172.17.33.143` via jumper `10.22.18.250`; BMC `172.17.33.144`.
- `journalctl -b -k` scan for `BTRFS error|I/O error|nvme.*timeout|device inaccessible|read-only|blk_update_request|Buffer I/O error` returned no matches.
- `mods/4.31.0` on test kernels, so DKMS hooks were temporarily bypassed only to finish package configuration/initramfs/grub; hooks were restored afterward.

References
Launchpad: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2154457
6.17 PR: #442
BOS PR: #443
NVBug: https://nvbugspro.nvidia.com/bug/6205517
NVBug: https://nvbugspro.nvidia.com/bug/6134331