Issue Description
We have encountered stability and boot-time issues when deploying Linux kernels v5.18 and later on platforms based on legacy 600/900 series RTL. The v5.10 kernel works well on the same platforms. We have traced these to compatibility issues: newer kernel features assume hardware capabilities that older RTL versions do not support.
We have identified two specific triggers and successfully implemented workarounds for both. We would like to share these findings and confirm whether this is the recommended approach.
Technical Details
1. SATP Register Probing
- Context: Applicable to 600/900 series RTL (versions prior to 2.4.0, approx. July 2022).
- Symptom: System hangs silently during early boot. Debugging shows scause = 0xc (Instruction Page Fault).
- Root Cause: The Linux kernel probes for hardware support of Sv48/Sv57 paging by writing to the satp register. The legacy RTL fails to handle these probe writes, resulting in a Page Fault before the console is initialized.
- Proposed Workaround: Patch the kernel to disable Sv48/Sv57 probing at initialization.
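
The probe described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct model_hart` and the `model_*` helpers are hypothetical names invented for illustration, and the write-then-read-back sequence is a simplified assumption about how the kernel detects the paging mode. The satp.MODE encodings and the rule that a write with an unsupported MODE has no effect come from the RISC-V privileged specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* satp.MODE encodings for RV64 (RISC-V privileged specification). */
#define SATP_MODE_SHIFT 60
#define SATP_MODE_SV39  8ULL
#define SATP_MODE_SV48  9ULL
#define SATP_MODE_SV57  10ULL

/* Hypothetical model of a core, not a kernel structure. */
struct model_hart {
	uint64_t satp;     /* modeled satp CSR value */
	uint64_t max_mode; /* highest satp.MODE this core implements */
};

/* Spec-compliant behavior: a write with an unsupported MODE is ignored. */
static void model_csr_write_satp(struct model_hart *h, uint64_t val)
{
	uint64_t mode = val >> SATP_MODE_SHIFT;

	if (mode == 0 || (mode >= SATP_MODE_SV39 && mode <= h->max_mode))
		h->satp = val;
}

/* Try one mode, check whether it stuck, then return satp to Bare (0). */
static bool model_probe_mode(struct model_hart *h, uint64_t mode)
{
	uint64_t want = mode << SATP_MODE_SHIFT;
	bool ok;

	model_csr_write_satp(h, want);
	ok = (h->satp == want);
	model_csr_write_satp(h, 0);
	return ok;
}

/*
 * Pick the widest supported mode. skip_probe models the workaround:
 * never touch satp during detection and settle for Sv39 directly.
 */
static uint64_t model_select_mode(struct model_hart *h, bool skip_probe)
{
	assert(h);
	if (skip_probe)
		return SATP_MODE_SV39;
	if (model_probe_mode(h, SATP_MODE_SV57))
		return SATP_MODE_SV57;
	if (model_probe_mode(h, SATP_MODE_SV48))
		return SATP_MODE_SV48;
	return SATP_MODE_SV39;
}
```

On a core that follows the specification, the probe path and the skip path both end at a valid mode. The legacy RTL deviates at the `model_csr_write_satp` step, which is why the workaround avoids the probe writes entirely rather than trying to recover from them.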
2. ASID-based TLB Flush
- Context: Applicable to 600/900 series RTL (versions prior to 3.1.0, approx. July 2023).
- Symptom: Random Page Faults or memory inconsistency issues during the init process.
- Root Cause: The Linux kernel introduced ASID-based TLB flushing as a performance optimization. Legacy RTL lacks hardware support for this mechanism, leading to incorrect TLB state after a flush attempt.
- Proposed Workaround: Disable the kernel's ASID allocator so that it falls back to a generic full TLB flush on context switch instead of ASID-specific flush routines.
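
Why forcing the ASID width to zero selects the full-flush path can be sketched with another user-space model. The `model_*` names and `enum flush_policy` are hypothetical, and both the detection step (write all ones to satp.ASID, then count the bits that read back) and the "more than twice the CPU count" threshold are assumptions about the kernel's logic rather than a copy of it. The satp.ASID field position (bits 59:44 on RV64) follows the RISC-V privileged specification.

```c
#include <assert.h>
#include <stdint.h>

/* satp.ASID field on RV64: bits 59:44 (RISC-V privileged specification). */
#define SATP_ASID_SHIFT 44
#define SATP_ASID_MASK  0xFFFFULL

/* Hypothetical model: satp as read back after writing all-ones to ASID. */
static uint64_t model_satp_after_asid_probe(unsigned int hw_asid_bits)
{
	uint64_t implemented;

	assert(hw_asid_bits <= 16);
	implemented = (1ULL << hw_asid_bits) - 1;
	return (SATP_ASID_MASK & implemented) << SATP_ASID_SHIFT;
}

/* ASID width = position of the highest bit that stuck in the field. */
static unsigned int model_count_asid_bits(uint64_t satp_readback)
{
	uint64_t asid = (satp_readback >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
	unsigned int bits = 0;

	while (asid) {
		bits++;
		asid >>= 1;
	}
	return bits;
}

enum flush_policy { FLUSH_ALL_ON_SWITCH, USE_ASID_ALLOCATOR };

/*
 * Assumed policy: use ASIDs only when there are more than twice as many
 * hardware ASIDs as CPUs. asid_bits == 0 always yields the full flush.
 */
static enum flush_policy model_pick_policy(unsigned int asid_bits,
					   unsigned int num_cpus)
{
	unsigned long num_asids = asid_bits ? (1UL << asid_bits) : 0;

	return (num_asids > 2UL * num_cpus) ? USE_ASID_ALLOCATOR
					    : FLUSH_ALL_ON_SWITCH;
}
```

The legacy RTL is the awkward case for this kind of detection: the ASID field accepts writes, so the probe reports a non-zero width even though ASID-tagged flushes do not work. Overriding the detected width with zero sidesteps the detection result instead of trying to make it more accurate.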
Proposed Workaround Patches
The following patches are made against v6.6: https://github.com/Nuclei-Software/linux/tree/44ef5e81b4b5f9d7c71aee1c927819935eceb394
Workaround for Issue 1 (arch/riscv/mm/init.c)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index bdf8ac6c7e30..b1e9d08949a7 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1106,7 +1106,16 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 #endif
 #if defined(CONFIG_64BIT) && !defined(CONFIG_XIP_KERNEL)
-	set_satp_mode(dtb_pa);
+	/*
+	 * Applicable to Nuclei 600/900 series RTL (versions prior to 2.4.0,
+	 * approx. July 2022): Older CPU RTL fails to handle SATP mode probing
+	 * for Sv48/Sv57 paging, resulting in a Page Fault before console
+	 * initialization. Force Sv39 (3-level paging) to avoid the probe.
+	 * Newer RTL versions (>= 2.4.0) handle SATP writes properly and can
+	 * enable Sv48/Sv57 if desired.
+	 */
+	disable_pgtable_l5();
+	disable_pgtable_l4();
 #endif
Workaround for Issue 2 (arch/riscv/mm/context.c)
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index ba8eb3944687..120d46a42bf8 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -244,6 +244,16 @@ static int __init asids_init(void)
 	 */
 	local_flush_tlb_all();
 
+	/*
+	 * Applicable to Nuclei 600/900 series RTL (versions prior to 3.1.0,
+	 * approx. July 2023): Older CPU RTL does not support ASID-tagged TLB
+	 * flush operations, even though SATP.ASID field accepts writes.
+	 * Force disable ASID allocator to always flush TLB on context switch.
+	 * Newer RTL versions (>= 3.1.0) support ASID properly and do not need
+	 * this workaround.
+	 */
+	asid_bits = 0;
+
 	/* Pre-compute ASID details */
 	if (asid_bits) {
 		num_asids = 1 << asid_bits;