Optimize stack management, fix preemption, and add automated QEMU testing for RISC-V32 port#513
Optimize stack management, fix preemption, and add automated QEMU testing for RISC-V32 port#513Winstonllllai wants to merge 11 commits intoeclipse-threadx:masterfrom
Conversation
Optimized system reliability and interrupt latency by implementing wfi in the idle loop and enforcing 16-byte alignment for the system stack. Details: 1. Reliability (WFI) Implementation: Modified tx_thread_schedule.S to include the wfi (Wait For Interrupt) instruction in the idle loop. The scheduler now enters low-power sleep mode when no threads are ready, instead of busy-waiting, reducing significant power consumption. 2. Interrupt Latency (Shadow Stack Alignment): Updated tx_initialize_low_level.S to enforce 16-byte alignment on the system stack pointer (sp) before saving it to _tx_thread_system_stack_ptr. Ensures strict adherence to the RISC-V ABI and prevents misaligned access faults or performance penalties during context switches to the interrupt stack.
- Add Python-based test runner automating QEMU launch and GDB control. - Verifies Context Switching, FPU Context Preservation, and Timer Interrupts. - Implements robust cleanup using try-finally blocks to prevent orphaned QEMU processes. - Update CMake configuration to include the `check-functional-riscv32` target. - Add floating-point arithmetic test logic to thread entry function. - Add global pointer definition to linker script to resolve relocation errors.
- Enabled `-mrelax` compiler flag in CMake configuration. - Updated entry.s to correctly initialize the Global Pointer (`gp`) using the `__global_pointer$` symbol. - Verified that `gp` is treated as a constant and no longer saved/restored during context switches, improving performance.
- Implement symbolic stack offsets for RISC-V 32-bit GNU port. - Harmonize context restoration across all scheduling paths. - Clean up assembly files within the risc-v32/gnu source directory.
Standardized stack frame access using sp instead of t0 in tx_thread_context_restore.S. Fixed global pointer address loading using la for _tx_thread_current_ptr and others. Implemented dynamic mstatus recovery from stack to preserve FPU and interrupt states. Optimized register storage sequence in the preemption path for better atomicity. Adjusted thread priorities in demo_threadx.c to facilitate preemption verification.
|
Hi @Winstonllllai! Thank you for this contribution. I will ask someone to review it and provide feedback. Since most senior team members were at Embedded World this week, we expect our feedback to be delayed. Before we can accept your code, you need to sign the Eclipse Contributor Agreement (ECA). The purpose of the ECA is to provide a written record that you have agreed to provide your code and documentation contributions under the licenses used by the Eclipse ThreadX project. It also makes it clear that you are promising that what you are contributing to Eclipse is code you wrote, and you have the necessary rights to contribute it to our projects. And finally, it documents a commitment from you that your open source contributions will be permanently on the public record. Signing the ECA requires an Eclipse Foundation account if you do not already have one. You can create one for free at https://accounts.eclipse.org. Be sure to use the same email address when you register for the account that you intend to use on Git commit records. Also, please add your GitHub ID to your Eclipse account. This enables synchronisation between Eclipse-owned infrastructure and GitHub. Here is the link to sign the ECA: |
|
Hi @fdesbiens , |
|
Hi @Winstonllllai. Thank you. The ECA check passes. I will ask a team member to have a look next week (March 23), when I am back from vacation. |
|
Hi @fdesbiens , |
|
@akifejaz Would you mind taking a look, please? |
There was a problem hiding this comment.
Hi @Winstonllllai, Thanks for the PR. Firstly, I'm little late in review, I apologize for that.
Here is my feedback on the PR.
-- we should not be using the tx_port.h in any of the .S file
-- hardcoded values in src/*.S are fine you can revert those back
-- can you please add more comments in entry.s & link.lds, the changes there are not justified.
-- almost all the files needs cleanup in terms of comments, alignment etc, you can refer to ThreadX coding standards for more details.
-- all changes in tx_port.h seems irrelevant you can revert these.
-- lets move the build instructions from project root CMakeLists.txt to port level CMakeLists.txt
moreover, I also added inline comments in some of the files, I didn't reviewed rest of the files/changes, but they also seem to have similar issues. Let me know if you get chance to cleanup the PR, and I'm happy to review it again.
Thanks!
| tx_thread_create(&thread_2, "thread 2", thread_2_entry, 2, | ||
| pointer, DEMO_STACK_SIZE, | ||
| 16, 16, 4, TX_AUTO_START); | ||
| 10, 10, 4, TX_AUTO_START); |
| } | ||
| } | ||
|
|
||
| float fpu_test_val = 0.0f; |
There was a problem hiding this comment.
this can be moved to top after other #defines.
| .option push | ||
| .option norelax | ||
| la gp, __global_pointer$ | ||
| .option pop |
There was a problem hiding this comment.
x3 register is missing. also I'm not sure why we need to change .text to .init.
| #if defined(__riscv_flen) && ((__riscv_flen == 32)||(__riscv_flen == 64)) | ||
| addi sp, sp, -65*REGBYTES // Allocate space for all registers - with floating point enabled | ||
| #else | ||
| addi sp, sp, -128 // Allocate space for all registers - without floating point enabled (32*4) | ||
| addi sp, sp, -32*REGBYTES // Allocate space for all registers - without floating point enabled | ||
| #endif | ||
|
|
||
| sw x1, 112(sp) // Store RA (28*4 = 112, because call will override ra [ra is a callee register in riscv]) | ||
| STORE x1, 28*REGBYTES(sp) // Store RA, 28*REGBYTES(because call will override ra [ra is a calle register in riscv]) |
There was a problem hiding this comment.
Here we intentionally used the hardcoded numbers, so we do not have to include .h file in .S files.
| #ifdef __ASSEMBLER__ | ||
|
|
||
|
|
||
| #if __riscv_xlen == 64 | ||
| # define SLL32 sllw | ||
| # define STORE sd | ||
| # define LOAD ld | ||
| # define LWU lwu | ||
| # define LOG_REGBYTES 3 | ||
| #else | ||
| # define SLL32 sll | ||
| # define STORE sw | ||
| # define LOAD lw | ||
| # define LWU lw | ||
| # define LOG_REGBYTES 2 | ||
| #endif | ||
| #define REGBYTES (1 << LOG_REGBYTES) | ||
|
|
||
| /* Define stack frame offsets for thread context save/restore. |
There was a problem hiding this comment.
not recommended as per ThreadX coding standards.
| la t0, _tx_thread_current_ptr | ||
| LOAD t1, 0(t0) // Pickup current thread pointer | ||
| beqz t1, _tx_thread_idle_system_restore // If NULL, idle system restore | ||
|
|
||
|
|
||
| la t0, _tx_thread_preempt_disable // Pickup preempt disable flag address | ||
| lw t2, 0(t0) // Pickup preempt disable flag (UINT) | ||
|
|
||
| la t0, _tx_thread_preempt_disable | ||
| LOAD t2, 0(t0) // Pickup preempt disable flag | ||
| bgtz t2, _tx_thread_no_preempt_restore // If set, restore interrupted thread | ||
|
|
||
|
|
||
| la t0, _tx_thread_execute_ptr // Pickup thread execute pointer address | ||
| lw t2, 0(t0) // Pickup thread execute pointer | ||
|
|
||
| la t0, _tx_thread_execute_ptr | ||
| LOAD t2, 0(t0) // Pickup thread execute pointer |
There was a problem hiding this comment.
same here, No changes needed here.
| la x5, _tx_thread_system_state // Pickup address of system state | ||
| LOAD x6, 0(x5) // Pickup system state | ||
|
|
||
| /* Check for a nested interrupt. */ | ||
| /* Check for a nested interrupt condition. */ | ||
| /* if (_tx_thread_system_state++) | ||
| { */ | ||
| beqz x6, _tx_thread_not_nested_save // If 0, first interrupt condition | ||
| addi x6, x6, 1 // Increment the interrupt counter | ||
| STORE x6, 0(x5) // Store the interrupt counter | ||
|
|
||
| /* Nested interrupt condition. | ||
| Save the reset of the scratch registers on the stack and return to the | ||
| calling ISR. */ | ||
|
|
||
| la t0, _tx_thread_system_state // Pickup addr of system state var | ||
| lw t1, 0(t0) // Pickup system state | ||
| addi t1, t1, 1 // Increment system state | ||
| sw t1, 0(t0) // Store system state | ||
| li t0, 1 | ||
| bgt t1, t0, _tx_thread_nested_save // If it's more than 1, nested interrupt | ||
|
|
||
| /* First level interrupt, save the rest of the scratch registers and | ||
| check for a thread to preempt. */ | ||
|
|
||
| sw t2, 17*4(sp) // Store t2 | ||
| sw s0, 12*4(sp) // Store s0 | ||
| sw a0, 27*4(sp) // Store a0 | ||
| sw a1, 26*4(sp) // Store a1 | ||
| sw a2, 25*4(sp) // Store a2 | ||
| sw a3, 24*4(sp) // Store a3 | ||
| sw a4, 23*4(sp) // Store a4 | ||
| sw a5, 22*4(sp) // Store a5 | ||
| sw a6, 21*4(sp) // Store a6 | ||
| sw a7, 20*4(sp) // Store a7 | ||
| sw t3, 16*4(sp) // Store t3 | ||
| sw t4, 15*4(sp) // Store t4 | ||
| sw t5, 14*4(sp) // Store t5 | ||
| sw t6, 13*4(sp) // Store t6 | ||
|
|
||
| /* Save floating point registers. */ | ||
| #if defined(__riscv_float_abi_single) | ||
| fsw f0, 31*4(sp) // Store ft0 | ||
| fsw f1, 32*4(sp) // Store ft1 | ||
| fsw f2, 33*4(sp) // Store ft2 | ||
| fsw f3, 34*4(sp) // Store ft3 | ||
| fsw f4, 35*4(sp) // Store ft4 | ||
| fsw f5, 36*4(sp) // Store ft5 | ||
| fsw f6, 37*4(sp) // Store ft6 | ||
| fsw f7, 38*4(sp) // Store ft7 | ||
| fsw f10, 41*4(sp) // Store fa0 | ||
| fsw f11, 42*4(sp) // Store fa1 | ||
| fsw f12, 43*4(sp) // Store fa2 | ||
| fsw f13, 44*4(sp) // Store fa3 | ||
| fsw f14, 45*4(sp) // Store fa4 | ||
| fsw f15, 46*4(sp) // Store fa5 | ||
| fsw f16, 47*4(sp) // Store fa6 | ||
| fsw f17, 48*4(sp) // Store fa7 | ||
| fsw f28, 59*4(sp) // Store ft8 | ||
| fsw f29, 60*4(sp) // Store ft9 | ||
| fsw f30, 61*4(sp) // Store ft10 | ||
| fsw f31, 62*4(sp) // Store ft11 | ||
| STORE x7, TX_STACK_OFFSET_X7(sp) // Store t2 | ||
| STORE x8, TX_STACK_OFFSET_X8(sp) // Store s0 | ||
| STORE x10, TX_STACK_OFFSET_X10(sp) // Store a0 | ||
| STORE x11, TX_STACK_OFFSET_X11(sp) // Store a1 | ||
| STORE x12, TX_STACK_OFFSET_X12(sp) // Store a2 | ||
| STORE x13, TX_STACK_OFFSET_X13(sp) // Store a3 |
There was a problem hiding this comment.
same here, No changes needed here.
| /* Pickup current interrupt lockout posture. */ | ||
|
|
||
| /* Pickup current interrupt posture. */ | ||
| csrr t0, mstatus | ||
| mv t1, t0 // Save original mstatus for return | ||
|
|
||
| csrr a1, mstatus // Pickup mstatus | ||
| andi a1, a1, 0x08 // Mask out all but MIE | ||
|
|
||
| /* Check for the new posture. */ | ||
|
|
||
| beqz a0, _tx_thread_interrupt_disable // If 0, disable interrupts | ||
|
|
||
| /* Enable interrupts. */ | ||
|
|
||
| csrsi mstatus, 0x08 // Enable interrupts (MIE bit 3) | ||
| j _tx_thread_interrupt_control_exit // Return to caller | ||
|
|
||
| _tx_thread_interrupt_disable: | ||
|
|
||
| /* Disable interrupts. */ | ||
|
|
||
| csrci mstatus, 0x08 // Disable interrupts (MIE bit 3) | ||
|
|
||
| _tx_thread_interrupt_control_exit: | ||
|
|
||
| /* Return the old interrupt posture. */ | ||
|
|
||
| mv a0, a1 // Setup return value | ||
| ret // Return to caller | ||
| /* Apply the new interrupt posture. */ |
There was a problem hiding this comment.
missing the _tx_thread_interrupt_control_exit, _tx_thread_interrupt_disable symbols.
| import subprocess | ||
| import sys | ||
| import time | ||
| import os | ||
| import argparse | ||
| import socket | ||
| import select | ||
|
|
||
| def print_content(content): | ||
| """Prints content using os.write to handle non-blocking stdout robustly.""" | ||
| try: | ||
| msg = f"{content}\n".encode('utf-8') | ||
| total_len = len(msg) | ||
| written = 0 | ||
| fd = sys.stdout.fileno() | ||
| while written < total_len: | ||
| try: |
There was a problem hiding this comment.
not sure if this script should be added under ports/ or not.
| include(CPack) | ||
|
|
||
| # Enable the QEMU Virt demo application (Target: kernel.elf) | ||
| set(QEMU_DEMO_DIR ${CMAKE_CURRENT_LIST_DIR}/ports/risc-v32/gnu/example_build/qemu_virt) | ||
|
|
||
| add_executable(kernel.elf | ||
| ${QEMU_DEMO_DIR}/demo_threadx.c | ||
| ${QEMU_DEMO_DIR}/entry.s | ||
| ${QEMU_DEMO_DIR}/uart.c | ||
| ${QEMU_DEMO_DIR}/plic.c | ||
| ${QEMU_DEMO_DIR}/hwtimer.c | ||
| ${QEMU_DEMO_DIR}/trap.c | ||
| ${QEMU_DEMO_DIR}/board.c | ||
| ${QEMU_DEMO_DIR}/tx_initialize_low_level.S | ||
| ) | ||
|
|
There was a problem hiding this comment.
This is port/ level cmake, lets move this under ports/riscv32/ if possible.
This PR focuses on technical debt cleanup and functional enhancements for the ThreadX RISC-V 32-bit (RV32) port.
.Sfiles with symbolic macros intx_port.h._tx_thread_context_restoreto resolve potential race conditions in high-priority task preemption.wfi(Wait For Interrupt) instruction to optimize CPU idle cycles and power efficiency.__global_pointer$inlink.ldsto enable GP Relaxation.How to Run Tests
To verify the changes using the automated framework, please run the following commands:
cd build_qemu make check-functional-riscv32This is my first PR in Threadx. Please let me know if there’s anything I should improve or if I missed anything.