[linux-6.6.y] efi/cper: Add Zhaoxin CPER error decode support#1701
leoliu-oc wants to merge 6 commits into
1. Wrap ZDI/ZPI related functions and strings with CONFIG_X86 to restrict the code to the x86 architecture only.
2. Remove the __maybe_unused attribute from cper_print_proc_generic_zdi_zpi(), since the function is now guarded by CONFIG_X86 and always used on x86.
3. Use zdi_zpi->responder_id directly as the error type parameter instead of the redundant local variable etype.
4. Clean up code formatting for better readability.
5. Replace IS_ENABLED(CONFIG_X86) with #ifdef CONFIG_X86 for consistent preprocessor usage.

Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
zhaoxin inclusion
category: feature
--------------------
The ZDI and ZPI of the Zhaoxin KH-50000 processor support reporting additional error types with more detailed analysis, hence the addition of KH-50000 ZDI and ZPI error parsing.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>

zhaoxin inclusion
category: feature
--------------------
The Zhaoxin processor's HIF and SVID error types are not defined in the UEFI spec and are reported via a non-standard CPER structure. This patch adds support to decode these proprietary error records.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>

zhaoxin inclusion
category: feature
--------------------
Zhaoxin processors report detailed micro-architectural error classifications by re-purposing the Responder ID and Requestor ID fields. This patch adds support to decode these proprietary error types into human-readable messages.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>

zhaoxin inclusion
category: feature
--------------------
Zhaoxin processors report detailed cache error types by re-purposing the Responder ID field. This patch adds a decoder to translate these numeric values into human-readable strings.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>

zhaoxin inclusion
category: feature
--------------------
Some memory error types on the Zhaoxin KH-50000 are reported by repurposing the Requestor ID field. Add parsing logic to decode this non-standard usage and present human-readable error messages.
Signed-off-by: LeoLiu-oc <leoliu-oc@zhaoxin.com>
Reviewer's Guide
Extends EFI CPER decoding/printing for Zhaoxin/Centaur x86 CPUs (notably KH-40000 and KH-50000) by refactoring the ZDI/ZPI handler into Zhaoxin-specific helpers, adding microarchitectural/cache/memory error decoding, and wiring in support for new Zhaoxin CPER sections (SVID and HIF) with their associated GUIDs, validation bits, and payload structures.
Flow diagram for Zhaoxin CPER processor and memory decode paths
flowchart TD
A[cper_estatus_print_section] -->|proc section| B[cper_print_proc_generic]
A -->|mem section| C[cper_print_mem]
B -->|vendor is ZHAOXIN/CENTAUR and CONFIG_X86| D[cper_print_proc_generic_zx]
C -->|vendor is ZHAOXIN/CENTAUR and CONFIG_X86| E[cper_print_mem_zx]
D -->|proc_error_type 0x1| F[cper_print_proc_generic_zx_cache]
D -->|proc_error_type 0x4| G[cper_print_proc_generic_zdi_zpi]
D -->|proc_error_type 0x8| H[cper_print_proc_generic_zx_micro_arch]
G -->|x86_model 0x5b| I[cper_print_proc_generic_zdi_zpi_kh40000]
G -->|x86_model 0x7b| J[cper_print_proc_generic_zdi_zpi_kh50000]
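The processor branch of the flow above can be sketched as a small selector function. This is only an illustration of the dispatch order shown in the diagram: the enum names and `zx_pick_proc_handler()` are hypothetical, the real code reads vendor and model from `boot_cpu_data`, and the sketch assumes `proc_error_type` is compared by equality, as the diagram labels suggest.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's vendor IDs and printer functions. */
enum zx_vendor { ZX_VENDOR_OTHER, ZX_VENDOR_CENTAUR, ZX_VENDOR_ZHAOXIN };

enum zx_handler {
	ZX_HANDLER_NONE,		/* no Zhaoxin-specific printer selected */
	ZX_HANDLER_CACHE,		/* cper_print_proc_generic_zx_cache */
	ZX_HANDLER_ZDI_ZPI_KH40000,	/* cper_print_proc_generic_zdi_zpi_kh40000 */
	ZX_HANDLER_ZDI_ZPI_KH50000,	/* cper_print_proc_generic_zdi_zpi_kh50000 */
	ZX_HANDLER_MICRO_ARCH,		/* cper_print_proc_generic_zx_micro_arch */
};

/* Mirror the diagram: vendor gate first, then error type, then CPU model. */
static enum zx_handler zx_pick_proc_handler(enum zx_vendor vendor,
					    uint8_t model,
					    uint8_t proc_error_type)
{
	if (vendor != ZX_VENDOR_ZHAOXIN && vendor != ZX_VENDOR_CENTAUR)
		return ZX_HANDLER_NONE;

	switch (proc_error_type) {
	case 0x1:
		return ZX_HANDLER_CACHE;
	case 0x4:
		if (model == 0x5b)
			return ZX_HANDLER_ZDI_ZPI_KH40000;
		if (model == 0x7b)
			return ZX_HANDLER_ZDI_ZPI_KH50000;
		return ZX_HANDLER_NONE;
	case 0x8:
		return ZX_HANDLER_MICRO_ARCH;
	default:
		return ZX_HANDLER_NONE;
	}
}
```

Returning a selector instead of printing keeps the sketch testable outside the kernel; the actual patch calls the printers directly.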
[APPROVALNOTIFIER] This PR is NOT APPROVED
Hi @leoliu-oc. Thanks for your PR. I'm waiting for a deepin-community member to verify that this patch is reasonable to test.
Hey - I've found 1 issue, and left some high level feedback:
- The new logic in `cper_zdi_zpi_err_type_str()` now depends on `boot_cpu_data.x86_model` for an exported symbol without checking the vendor; consider gating the model-specific behavior on Zhaoxin/Centaur vendors (falling back to the generic table otherwise) to avoid surprising behavior on other x86 CPUs that might call this helper.
- The checks for specific Zhaoxin CPU families/models (e.g., `boot_cpu_data.x86 == 0x7`, `x86_model == 0x5b/0x7b`) are currently bare literals; introducing named constants or macros for these KH-40000/KH-50000 IDs would improve readability and make future maintenance of additional models clearer.
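Both suggestions can be sketched together in plain C. Everything named here is an assumption for illustration: the `ZX_*` macros, `struct cpu_id`, and `zx_zdi_zpi_etype_known()` do not appear in the PR, and `table_len` stands in for `ARRAY_SIZE(zdi_zpi_err_type_strs)`. The per-model ranges are copied from the quoted `cper_zdi_zpi_err_type_str()` diff.

```c
#include <stdbool.h>

/* Hypothetical named constants replacing the bare literals. */
#define ZX_FAMILY		0x7
#define ZX_MODEL_KH40000	0x5b
#define ZX_MODEL_KH50000	0x7b

/* Hypothetical stand-in for the fields read from boot_cpu_data. */
enum cpu_vendor { VENDOR_OTHER, VENDOR_CENTAUR, VENDOR_ZHAOXIN };

struct cpu_id {
	enum cpu_vendor vendor;
	unsigned int family;
	unsigned int model;
};

/* True only for a Zhaoxin/Centaur family-7 CPU of the given model. */
static bool zx_is_model(const struct cpu_id *c, unsigned int model)
{
	return (c->vendor == VENDOR_ZHAOXIN || c->vendor == VENDOR_CENTAUR) &&
	       c->family == ZX_FAMILY && c->model == model;
}

/*
 * Report whether etype maps to a string. Model-specific holes apply only
 * on the matching Zhaoxin/Centaur CPU; any other CPU falls back to the
 * plain table bound, as the review suggests.
 */
static bool zx_zdi_zpi_etype_known(const struct cpu_id *c,
				   unsigned int etype,
				   unsigned int table_len)
{
	if (etype >= table_len)
		return false;
	if (zx_is_model(c, ZX_MODEL_KH40000))
		return etype < 0x13;
	if (zx_is_model(c, ZX_MODEL_KH50000))
		return !(etype == 0x6 || (etype >= 0xb && etype <= 0x12));
	return true;
}
```

With the vendor check folded into `zx_is_model()`, a non-Zhaoxin CPU that happens to report model 0x7b is no longer subject to the KH-50000 exclusions.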
## Individual Comments
### Comment 1
<location path="drivers/firmware/efi/cper.c" line_range="286" />
<code_context>
+ if (etype == 0xf) {
+ pr_info("%s general processor error(zpi port 0x%llx error)\n",
+ pfx, zdi_zpi->requestor_id & 0xf);
+ } else if (etype >= 0x0 && etype <= 0xb) {
+ switch (zdi_zpi->requestor_id & 0xf) {
+ case 0x0:
</code_context>
<issue_to_address>
**issue:** The log message for KH50000 ZDI errors prints `etype` where it appears to intend the ZDI port nibble.
In `cper_print_proc_generic_zdi_zpi_kh50000()`, `etype` is `(zdi_zpi->requestor_id & 0xff) >> 4` (error type), while the ZDI port is `(zdi_zpi->requestor_id & 0xf)`. The log currently passes `etype` to the `%x` placeholder labeled as "zdi port", which is misleading. Please pass `(zdi_zpi->requestor_id & 0xf)` instead, or assign it to a local `port` variable and log that for clarity.
</issue_to_address>
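The field split described in this comment can be sketched as a helper that names both nibbles, so the log call cannot mix them up. The struct, function names, and message text are hypothetical; only the two mask expressions come from the comment above.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the two nibbles packed into requestor_id. */
struct zx_req_fields {
	unsigned int etype;	/* bits 7:4, the error type */
	unsigned int port;	/* bits 3:0, the port number */
};

/* Split the low byte of requestor_id as the review comment describes. */
static struct zx_req_fields zx_split_requestor_id(uint64_t requestor_id)
{
	struct zx_req_fields f;

	f.etype = (unsigned int)((requestor_id & 0xff) >> 4);
	f.port = (unsigned int)(requestor_id & 0xf);
	return f;
}

/* Format a message that logs the port under the "port" label. */
static int zx_format_port_msg(char *buf, size_t len, const char *pfx,
			      uint64_t requestor_id)
{
	struct zx_req_fields f = zx_split_requestor_id(requestor_id);

	return snprintf(buf, len, "%sport 0x%x, error type 0x%x",
			pfx, f.port, f.etype);
}
```

In the kernel the message would go through `pr_info()` rather than `snprintf()`; the buffer version is used here only so the sketch runs in user space.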
Pull request overview
Extends the EFI CPER error decoder in drivers/firmware/efi/cper.c with Zhaoxin/Centaur-specific decoding for processor (ZDI/ZPI, micro-architectural CPU/shutdown, cache), memory (KH-50000), and two new non-standard CPER section types (SVID and HIF, including CXL and SNT sub-decoding). Decoding is gated to x86 builds and (in most paths) Zhaoxin/Centaur vendors, with KH-40000 (family 7, model 0x5b) and KH-50000 (family 7, model 0x7b) handled separately.
Changes:
- Adds per-model ZDI/ZPI dispatch (KH-40000 vs KH-50000) and extra error-type strings for KH-50000.
- Adds Zhaoxin micro-architectural CPU/shutdown, cache, and KH-50000 memory error decoding.
- Adds new CPER section GUIDs and structures (`CPER_SEC_SVID`, `CPER_SEC_HIF`) plus decoding/printing routines (including CXL and SNT sub-fields).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| include/linux/cper.h | Adds SVID/HIF section GUIDs, validation-bit defines, and cper_sec_svid / cper_sec_hif structures. |
| drivers/firmware/efi/cper.c | Implements Zhaoxin processor/memory/SVID/HIF decoding, dispatches by family/model, and wires new GUIDs into cper_estatus_print_section. |
	"exit smm mode if the reserved bits in CR4 are writed to 1 in smm mode",
	"exit smm mode if CR4.VMXE bit is writed to 1 in smm mode",
	"exit smm mode if CR4.PCIDE bit is writed to 1 and EFER.LMA bit is writed to 0 in smm mode",
	"software inject an UC error after MCE changed to SMI happened",
	"reserved",
	"MCE happened when CR4 MCE is 0",
	"MCE happened again when didn't clear MCIP in the first MCE handler",

	"correctable error",
	"uncorrectable error",
	"multi correctable error",
	"multi Uncorrectable error",

	if (etype == 0xf) {
		pr_info("%s general processor error(zpi port 0x%llx error)\n",
			pfx, zdi_zpi->requestor_id & 0xf);
	} else if (etype >= 0x0 && etype <= 0xb) {

		break;
	case 0x7b:
		cper_print_proc_generic_zdi_zpi_kh50000(pfx, zdi_zpi);
		break;
	default:
		return;
	}
}

const char *cper_zdi_zpi_err_type_str(unsigned int etype)
{
	switch (boot_cpu_data.x86_model) {
	case 0x5b:
		if (etype >= 0x13)
			return "unknown error";
		break;
	case 0x7b:
		if (etype == 0x6 || (etype >= 0xb && etype <= 0x12))
			return "unknown error";
		break;
	default:
		return "unknown error";
	}
	return etype < ARRAY_SIZE(zdi_zpi_err_type_strs) ?
		zdi_zpi_err_type_strs[etype] : "unknown error";
		zdi_zpi_err_type_strs[etype] :
		"unknown error";
}
EXPORT_SYMBOL_GPL(cper_zdi_zpi_err_type_str);

struct cper_sec_svid {
	u8 validation_bits;
	u8 socket_id;
	u8 svid_id;
	u8 vrm_number;
	u16 error_type;
	u16 reserved;
};

struct cper_sec_hif {
	u16 validation_bits;
	u8 socket_id;
	u8 hnod_id;
	u8 snt_location;
	u8 snt_error_type;
	u8 snt_error_data[5];
	u8 snt_error_addr[5];
	u64 dvad_error_addr[6];
	u64 cxl_decode_error_addr[2];
	u8 cxl_error_type[2];
};

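Because these structures overlay firmware-provided records and the section printer compares `error_data_length` against `sizeof(*hif_err)`, the in-memory size of `struct cper_sec_hif` matters. The sketch below copies the field list into two user-space structs, one with natural alignment and one byte-packed, to show where they can differ. The struct and function names are hypothetical, and this PR page does not state whether the new definitions sit inside a byte-packed region of `cper.h`; if they do, the two sizes coincide.

```c
#include <stddef.h>
#include <stdint.h>

/* Field list copied from the quoted struct cper_sec_hif, natural alignment. */
struct hif_natural {
	uint16_t validation_bits;
	uint8_t  socket_id;
	uint8_t  hnod_id;
	uint8_t  snt_location;
	uint8_t  snt_error_type;
	uint8_t  snt_error_data[5];
	uint8_t  snt_error_addr[5];
	uint64_t dvad_error_addr[6];
	uint64_t cxl_decode_error_addr[2];
	uint8_t  cxl_error_type[2];
};

/* Same fields, byte-packed: the size equals the sum of the field sizes. */
#pragma pack(push, 1)
struct hif_packed {
	uint16_t validation_bits;
	uint8_t  socket_id;
	uint8_t  hnod_id;
	uint8_t  snt_location;
	uint8_t  snt_error_type;
	uint8_t  snt_error_data[5];
	uint8_t  snt_error_addr[5];
	uint64_t dvad_error_addr[6];
	uint64_t cxl_decode_error_addr[2];
	uint8_t  cxl_error_type[2];
};
#pragma pack(pop)

/* Bytes the compiler appends after the last field in the natural layout. */
static size_t hif_tail_padding(void)
{
	return sizeof(struct hif_natural) - sizeof(struct hif_packed);
}
```

The declared fields add up to 82 bytes, and the first `u64` array already starts at offset 16, so any difference between the two layouts is trailing padding only. A length check against a padded `sizeof` would reject a record that carries exactly the declared fields.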
	} else if (guid_equal(sec_type, &CPER_SEC_SVID)) {
		struct cper_sec_svid *svid_err = acpi_hest_get_payload(gdata);

		printk("%ssection_type: SVID Error\n", newpfx);
		if (gdata->error_data_length >= sizeof(*svid_err))
			cper_print_svid_err(newpfx, svid_err);
		else
			goto err_section_too_small;
	} else if (guid_equal(sec_type, &CPER_SEC_HIF)) {
		struct cper_sec_hif *hif_err = acpi_hest_get_payload(gdata);

		printk("%ssection_type: HIF Error\n", newpfx);
		if (gdata->error_data_length >= sizeof(*hif_err))
			cper_print_hif_err(newpfx, hif_err);
		else
			goto err_section_too_small;
This patch series extends EFI CPER error reporting for Zhaoxin/Centaur x86 processors, with specific support for the KH-50000 family.
The series achieves the following:
The new logic is restricted to x86 builds and only activates for Zhaoxin/Centaur vendors.
Patch queue:
0001-efi-cper-Refactor-Zhaoxin-ZDI-ZPI-error-print-code.patch
0002-efi-cper-Add-Zhaoxin-ZDI-ZPI-error-decode-for-KH-50000.patch
0003-efi-cper-Add-Zhaoxin-non-standard-cper-error-decode.patch
0004-efi-cper-Add-Zhaoxin-micro-architectural-error-decod.patch
0005-efi-cper-Add-Zhaoxin-cache-error-decode.patch
0006-efi-cper-Add-Zhaoxin-mem-error-decode-for-KH-50000.patch
Summary by Sourcery
Extend EFI CPER error reporting with Zhaoxin/Centaur-specific decoding for processor, memory, and new vendor sections.