Commit 4838290

Update docs and esp32s3 compiler optimization
1 parent 50e15fe commit 4838290

9 files changed

Lines changed: 106 additions & 81 deletions

readme.md

Lines changed: 9 additions & 3 deletions
@@ -402,6 +402,12 @@ The Espressif (`target xtensa_esp32_s3`) port for NodeMCU ESP32-S3
 features a bare-metal startup _without_ using any of the SDK.
 The bare-metal startup was taken from the work of
 [Chalandi/Baremetal_esp32s3_nosdk](https://github.com/Chalandi/Baremetal_esp32s3_nosdk).
+The dual-core system first boots core0 which subsequently
+starts up core1. Blinky runs in the standard `ref_app`
+on core0 toggling `port7` while an endless timer loop on core1
+toggles `port6`. The LED ports toggle in near unison at $\frac{1}{2}~\text{Hz}$.
+Self-procured LEDs and resistors need to be fitted in order to observe
+blinky on this particular board.

 The NXP(R) OM13093 LPC11C24 board ARM(R) Cortex(R)-M0+ configuration
 called `target lpc11c24` toggles the LED on `port0.8`.
@@ -460,9 +466,9 @@ The program toggles the GPIO status LED at GPIO index `0x47`.
 The `rpi_pico_rp2040` target configuration employs the
 RaspberryPi(R) Pico RP2040 with dual-core ARM(R) Cortex(R)-M0+
 clocked at $133~\text{MHz}$. The low-level startup boots through
-core 0. Core 0 then starts up core 1 (via a specific protocol).
-Core 1 subsequently carries out the blinky application,
-while core 0 enters an endless, idle loop.
+core0. Core0 then starts up core1 (via a specific protocol).
+Core1 subsequently carries out the blinky application,
+while core0 enters an endless, idle loop.
 Ozone debug files are supplied for this system for those interested.
 Reverse engineering of the complicated (and scantily documented)
 dual-core startup originated in and has been taken from (with many thanks)

ref_app/src/app/benchmark/readme.md

Lines changed: 3 additions & 3 deletions
@@ -108,14 +108,14 @@ The $32$-bit RISC-V controller (having a novel _open-source_ core)
 on the `wch_ch32v307` board boasts a quite respectable
 time of $8.0~\text{ms}$.

-Running on only one core (core 0) of the $32$-bit
+Running on only one core (core0) of the $32$-bit
 controller of the `xtensa_esp32_s3` board results in
 a runtime of $9.1~\text{ms}$ for the calculation.

-Using only one core (core 1) on the $32$-bit ARM(R) Cortex(R) M0+
+Using only one core (core1) on the $32$-bit ARM(R) Cortex(R) M0+
 controller of the `rpi_pico_rp2040` board results in a calculation
 time of $19~\text{ms}$. The next generation `rpi_pico2_rp2350`
 with dual ARM(R) Cortex(R) M33 cores definitively improves on this
-(still using only core 1) with a time of $6.3~\text{ms}$.
+(still using only core1) with a time of $6.3~\text{ms}$.
 This is slightly more than $3$ times faster
 than its predecessor.
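
The speedup quoted in the benchmark text can be checked with a few lines of integer arithmetic. The following is only a sketch: the constant and function names are hypothetical, and the two runtimes are the rounded figures from the readme ($19~\text{ms}$ and $6.3~\text{ms}$), so the result is approximate.

```cpp
#include <cstdint>

// Runtimes taken from the readme text, expressed in microseconds.
constexpr std::uint32_t runtime_rp2040_us { UINT32_C(19000) };
constexpr std::uint32_t runtime_rp2350_us { UINT32_C(6300) };

// Speedup factor scaled by 100 (integer division, so the result is truncated).
constexpr auto speedup_x100(const std::uint32_t slow_us, const std::uint32_t fast_us) -> std::uint32_t
{
  return static_cast<std::uint32_t>((slow_us * UINT32_C(100)) / fast_us);
}

// 19 ms / 6.3 ms is slightly more than 3 (a unitless ratio).
static_assert(speedup_x100(runtime_rp2040_us, runtime_rp2350_us) == UINT32_C(301),
              "Error: Unexpected speedup factor");
```

The ratio carries no unit, which is why the sentence reads best as "slightly more than $3$ times faster".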

ref_app/src/mcal/rpi_pico2_rp2350/mcal_cpu_rp2350.cpp

Lines changed: 8 additions & 8 deletions
@@ -106,26 +106,26 @@ auto mcal::cpu::rp2350::start_core1() -> bool
 mcal::reg::sio_fifo_st,
 UINT32_C(0)>::bit_get());

-// Send 0 to wake up core 1.
+// Send 0 to wake up core1.
 local::sio_fifo_write_verify(std::uint32_t { UINT32_C(0) });

-// Send 1 to synchronize with core 1.
+// Send 1 to synchronize with core1.
 local::sio_fifo_write_verify(std::uint32_t { UINT32_C(1) });

 static_assert(sizeof(std::uint32_t) == sizeof(std::uintptr_t), "Error: Pointer/address size mismatch");

-// Send the VTOR address for core 1.
+// Send the VTOR address for core1.
 local::sio_fifo_write_verify(reinterpret_cast<std::uint32_t>(&__INTVECT_Core1[0U]));

-// Send the stack pointer value for core 1.
+// Send the stack pointer value for core1.
 local::sio_fifo_write_verify(__INTVECT_Core1[0U]);

-// Send the reset handler address for core 1.
+// Send the reset handler address for core1.
 local::sio_fifo_write_verify(__INTVECT_Core1[1U]);

-// Clear the sticky bits of the FIFO_ST on core 0.
-// Note: Core 0 has called us to get here so these are,
-// in fact, the FIFO_ST sticky bits on core 0.
+// Clear the sticky bits of the FIFO_ST on core0.
+// Note: core0 has called us to get here so these are,
+// in fact, the FIFO_ST sticky bits on core0.

 // HW_PER_SIO->FIFO_ST.reg = 0xFFu;
 mcal::reg::reg_access_static<std::uint32_t,

ref_app/src/mcal/rpi_pico_rp2040/mcal_cpu_rp2040.cpp

Lines changed: 8 additions & 8 deletions
@@ -94,26 +94,26 @@ auto mcal::cpu::rp2040::start_core1() -> bool
 mcal::reg::sio_fifo_st,
 UINT32_C(0)>::bit_get());

-// Send 0 to wake up core 1.
+// Send 0 to wake up core1.
 local::sio_fifo_write_verify(std::uint32_t { UINT32_C(0) });

-// Send 1 to synchronize with core 1.
+// Send 1 to synchronize with core1.
 local::sio_fifo_write_verify(std::uint32_t { UINT32_C(1) });

 static_assert(sizeof(std::uint32_t) == sizeof(std::uintptr_t), "Error: Pointer/address size mismatch");

-// Send the VTOR address for core 1.
+// Send the VTOR address for core1.
 local::sio_fifo_write_verify(reinterpret_cast<std::uint32_t>(&__INTVECT_Core1[0U]));

-// Send the stack pointer value for core 1.
+// Send the stack pointer value for core1.
 local::sio_fifo_write_verify(__INTVECT_Core1[0U]);

-// Send the reset handler address for core 1.
+// Send the reset handler address for core1.
 local::sio_fifo_write_verify(__INTVECT_Core1[1U]);

-// Clear the sticky bits of the FIFO_ST on core 0.
-// Note: Core 0 has called us to get here so these are,
-// in fact, the FIFO_ST sticky bits on core 0.
+// Clear the sticky bits of the FIFO_ST on core0.
+// Note: core0 has called us to get here so these are,
+// in fact, the FIFO_ST sticky bits on core0.

 // SIO->FIFO_ST.reg = 0xFFU;
 mcal::reg::reg_access_static<std::uint32_t,
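
The order of the launch words sent by `start_core1()` in the hunk above (0, 1, VTOR address, stack pointer, reset handler) can be modelled on a host machine. This is a minimal sketch and not the project's implementation: `fake_fifo`, `launch_sequence` and any addresses passed in are hypothetical, and the echo check only assumes that `local::sio_fifo_write_verify` confirms each word it sends.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side stand-in for the inter-core FIFO.
struct fake_fifo
{
  std::array<std::uint32_t, 5U> log { };  // Words received so far.
  std::size_t count { };                  // Number of words received.

  // Record one word and report whether the echoed word matches.
  // The fake echoes the stored word back; real hardware would
  // obtain the echo from core1 instead.
  auto write_verify(const std::uint32_t value) -> bool
  {
    if(count >= log.size()) { return false; }

    log[count] = value;

    const std::uint32_t echoed { log[count] };

    ++count;

    assert(count <= log.size());

    return (echoed == value);
  }
};

// Send the five launch words in the same order as the diff above.
inline auto launch_sequence(fake_fifo& fifo,
                            const std::uint32_t vtor_address,
                            const std::uint32_t stack_pointer,
                            const std::uint32_t reset_handler) -> bool
{
  const std::array<std::uint32_t, 5U> words
  {{
    UINT32_C(0),    // Wake up core1.
    UINT32_C(1),    // Synchronize with core1.
    vtor_address,   // Address of the core1 vector table.
    stack_pointer,  // First vector table entry.
    reset_handler   // Second vector table entry.
  }};

  for(const std::uint32_t word : words)
  {
    if(!fifo.write_verify(word)) { return false; }
  }

  return true;
}
```

Calling `launch_sequence` with a fresh `fake_fifo` and three arbitrary sample addresses returns `true` and leaves the five words in `log` in sending order.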

ref_app/src/mcal/xtensa_esp32_s3/mcal_cpu.cpp

Lines changed: 31 additions & 16 deletions
@@ -26,7 +26,9 @@ extern "C"
 extern "C"
 void Mcu_StartCore1()
 {
-// Unstall core 1.
+// Note: This subroutine is called from core0.
+
+// Firstly we need to unstall core1.

 // RTC_CNTL->OPTIONS0.bit.SW_STALL_APPCPU_C0 = 0;
 // RTC_CNTL->SW_CPU_STALL.bit.SW_STALL_APPCPU_C1 = 0;
@@ -43,41 +45,52 @@ void Mcu_StartCore1()

 mcal::reg::reg_access_static<std::uint32_t, std::uint32_t, mcal::reg::system::core_1_control_0, static_cast<std::uint32_t>(UINT8_C(0))>::bit_clr();

-// Enable the clock for core 1.
+// Enable the clock for core1.

 // SYSTEM->CORE_1_CONTROL_0.bit.CONTROL_CORE_1_CLKGATE_EN = 1;
 mcal::reg::reg_access_static<std::uint32_t, std::uint32_t, mcal::reg::system::core_1_control_0, static_cast<std::uint32_t>(UINT8_C(1))>::bit_set();

-// Reset core 1.
+// Reset core1.

 // SYSTEM->CORE_1_CONTROL_0.bit.CONTROL_CORE_1_RESETING = 1;
 // SYSTEM->CORE_1_CONTROL_0.bit.CONTROL_CORE_1_RESETING = 0;
 mcal::reg::reg_access_static<std::uint32_t, std::uint32_t, mcal::reg::system::core_1_control_0, static_cast<std::uint32_t>(UINT8_C(2))>::bit_set();
 mcal::reg::reg_access_static<std::uint32_t, std::uint32_t, mcal::reg::system::core_1_control_0, static_cast<std::uint32_t>(UINT8_C(2))>::bit_clr();

-// Note: In ESP32-S3, when the reset of the core1 is released,
-// the core1 starts executing the bootROM code and it gets stuck
-// in a trap waiting for the entry address to be received
-// from core0. This is can be achieved by writing the core1 entry
-// address to the register SYSTEM_CORE_1_CONTROL_1_REG from core0.
+// Note: In ESP32-S3 when the reset of core1 is released,
+// then core1 starts executing the bootROM code. Core1
+// subsequently gets stuck in a trap. It is waiting for
+// the entry address to be received from core0.

-// Set the core1 entry address.
+// The send/receive transaction of the entry address is
+// carried out via core0 deliberately writing the core1
+// entry address in the SYSTEM_CORE_1_CONTROL_1_REG register.

-// SYSTEM->CORE_1_CONTROL_1.reg = (uint32_t) &_start;
 {
-const std::uint32_t start_addr { reinterpret_cast<std::uint32_t>(&_start) };
+// Set the core1 entry address.
+
+using mcal_reg_access_dynamic_type = mcal::reg::reg_access_dynamic<std::uint32_t, std::uint32_t>;
+
+// SYSTEM->CORE_1_CONTROL_1.reg = (uint32_t) &_start;

-mcal::reg::reg_access_dynamic<std::uint32_t, std::uint32_t>::reg_set(mcal::reg::system::core_1_control_1, start_addr);
+mcal_reg_access_dynamic_type::reg_set
+(
+mcal::reg::system::core_1_control_1,
+static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(&_start))
+);
 }
 }

 extern "C"
 void main_c1()
 {
-// Set the private cpu timer1 for core 1.
+// Note: This subroutine executes in core1. It has been called
+// by the core1 branch of the subroutine _start().
+
+// Set the private cpu timer1 for core1.
 set_cpu_private_timer1(mcal::gpt::timer1_reload());

-// Enable all interrupts on core 1.
+// Enable all interrupts on core1.
 mcal::irq::init(nullptr);

 // GPIO->OUT.reg |= CORE1_LED;
@@ -88,10 +101,12 @@ void main_c1()

 auto mcal::cpu::post_init() noexcept -> void
 {
-// Set the private cpu timer1 for core 0.
+// Note: This subroutine is called from core0.
+
+// Set the private cpu timer1 for core0.
 set_cpu_private_timer1(mcal::gpt::timer1_reload());

-// Use core 0 to start core 1.
+// Use core0 to start core1.
 Mcu_StartCore1();
 }
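
The register steps carried out by `Mcu_StartCore1()` in this file can be traced on a host with plain variables standing in for the memory-mapped registers. This is a sketch under stated assumptions: `fake_system_regs`, `start_core1_sketch` and the bit-mask names are hypothetical, the last template argument of `reg_access_static` is assumed to be a bit position, and bit 0 is assumed to be a run-stall bit because the comment above that `bit_clr()` call is not visible in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for SYSTEM->CORE_1_CONTROL_0 and SYSTEM->CORE_1_CONTROL_1.
struct fake_system_regs
{
  std::uint32_t core_1_control_0 { };
  std::uint32_t core_1_control_1 { };
};

constexpr std::uint32_t bit_runstall   { UINT32_C(1) << 0U }; // Assumed meaning of bit 0.
constexpr std::uint32_t bit_clkgate_en { UINT32_C(1) << 1U }; // CONTROL_CORE_1_CLKGATE_EN.
constexpr std::uint32_t bit_reseting   { UINT32_C(1) << 2U }; // CONTROL_CORE_1_RESETING.

// Replay the sequence from the diff: clear bit 0, enable the clock,
// pulse the reset bit and finally publish the core1 entry address.
inline auto start_core1_sketch(fake_system_regs& regs, const std::uint32_t entry_address) -> void
{
  regs.core_1_control_0 &= ~bit_runstall;
  regs.core_1_control_0 |= bit_clkgate_en;

  regs.core_1_control_0 |= bit_reseting;
  regs.core_1_control_0 &= ~bit_reseting;

  // The reset must be released before the entry address is written.
  assert((regs.core_1_control_0 & bit_reseting) == UINT32_C(0));

  regs.core_1_control_1 = entry_address;
}
```

After the call only the clock-gate bit remains set in the first register, and the second register holds the entry address that, according to the source comments, core1 is waiting for.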

ref_app/target/micros/rpi_pico2_rp2350/startup/crt0.cpp

Lines changed: 23 additions & 23 deletions
@@ -68,9 +68,9 @@ auto __my_startup() -> void
 mcal::wdg::secure::trigger();

 // Jump to __main, which calls __main_core0, the main
-// function of core 0. The main function of core 0
-// itself then subsequently starts up core 1 which
-// is launched in __main_core1. Both of these core 0/1
+// function of core0. The main function of core0
+// itself then subsequently starts up core1 which
+// is launched in __main_core1. Both of these core0/1
 // subroutines will never return.

 ::__main();
@@ -86,17 +86,17 @@ auto __my_startup() -> void
 extern "C"
 auto __main() -> void
 {
-// Run the main function of core 0.
-// This will subsequently start core 1.
+// Run the main function of core0.
+// This will subsequently start core1.
 ::__main_core0();

-// Synchronize with core 1.
+// Synchronize with core1.
 mcal::cpu::rp2350::multicore_sync(local::get_cpuid());

 // It is here that an actual application could
-// be started and then executed on core 0.
+// be started and then executed on core0.

-// Execute an endless loop on core 0 (while the application runs on core 1).
+// Execute an endless loop on core0 (while the application runs on core1).
 for(;;) { mcal::cpu::nop(); }

 // This point is never reached.
@@ -105,45 +105,45 @@ auto __main() -> void
 extern "C"
 auto __main_core0() -> void
 {
-// Disable interrupts on core 0.
+// Disable interrupts on core0.
 mcal::irq::disable_all();

-// Start core 1 and verify successful initialization of core 1.
+// Start core1 and verify successful initialization of core1.
 if(!mcal::cpu::rp2350::start_core1())
 {
-// In case of error, loop forever (on core 0).
+// In case of error, loop forever (on core0).
 for(;;)
 {
 // Replace with a loud error if desired.
 mcal::wdg::secure::trigger();
 }
 }

-// This flag will be set by core 1 (which is now running).
+// This flag will be set by core1 (which is now running).
 while(!core_1_run_flag_get())
 {
 mcal::cpu::nop();
 }

-// This subroutine (running on core 0) *does* return
+// This subroutine (running on core0) *does* return
 // at this point here.
 }

 extern "C"
 auto __main_core1() -> void
 {
-// Disable interrupts on core 1.
+// Disable interrupts on core1.
 mcal::irq::disable_all();

 core_1_run_flag_set(true);

-// Core 1 is started via interrupt enabled by the BootRom.
-// But core 1 remains in an interrupt handler until core 0
-// actually manually starts core 1 in the subroutine
-// mcal::cpu::rp2350::start_core1(). Execution on core 1
+// Core1 is started via interrupt enabled by the BootRom.
+// But core1 remains in an interrupt handler until core0
+// actually manually starts core1 in the subroutine
+// mcal::cpu::rp2350::start_core1(). Execution on core1
 // begins here.

-// Clear the sticky bits of the FIFO_ST on core 1.
+// Clear the sticky bits of the FIFO_ST on core1.
 // HW_PER_SIO->FIFO_ST.reg = 0xFFu;
 mcal::reg::reg_access_static<std::uint32_t,
 std::uint32_t,
@@ -162,18 +162,18 @@ auto __main_core1() -> void

 asm volatile("dsb");

-// Clear all pending interrupts on core 1.
+// Clear all pending interrupts on core1.

 // NVIC->ICPR[0U] = static_cast<std::uint32_t>(UINT32_C(0xFFFFFFFF));
 mcal::reg::reg_access_static<std::uint32_t,
 std::uint32_t,
 mcal::reg::nvic_icpr,
 std::uint32_t { UINT32_C(0xFFFFFFFF) }>::reg_set();

-// Synchronize with core 0.
+// Synchronize with core0.
 mcal::cpu::rp2350::multicore_sync(local::get_cpuid());

-// Enable the hardware FPU on Core 1.
+// Enable the hardware FPU on core1.

 mcal::reg::reg_access_static<std::uint32_t,
 std::uint32_t,
@@ -185,7 +185,7 @@ auto __main_core1() -> void
 mcal::reg::ppb_cpacr,
 std::uint32_t { (3UL << 20U) | (3UL << 22U) }>::reg_or();

-// Jump to main on core 1 (and never return).
+// Jump to main on core1 (and never return).
 asm volatile("ldr r3, =main");
 asm volatile("blx r3");
 }
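
The mask OR-ed into `ppb_cpacr` in the last hunk, `(3UL << 20U) | (3UL << 22U)`, can be evaluated at compile time. The sketch below uses hypothetical names throughout. It reads the two bit fields as the CP10 and CP11 access fields of the ARM(R) Cortex(R)-M CPACR register, which is consistent with the "Enable the hardware FPU" comment in the source.

```cpp
#include <cstdint>

// Full access (binary 11) for coprocessors CP10 and CP11.
constexpr std::uint32_t cpacr_cp10_full { UINT32_C(3) << 20U };
constexpr std::uint32_t cpacr_cp11_full { UINT32_C(3) << 22U };

constexpr std::uint32_t cpacr_fpu_enable_mask { cpacr_cp10_full | cpacr_cp11_full };

// Mirrors the effect of reg_or(): set the mask bits and keep all other bits.
constexpr auto cpacr_with_fpu(const std::uint32_t cpacr_value) -> std::uint32_t
{
  return static_cast<std::uint32_t>(cpacr_value | cpacr_fpu_enable_mask);
}

static_assert(cpacr_fpu_enable_mask == UINT32_C(0x00F00000),
              "Error: Unexpected CPACR mask");
```

Because the register is modified with an OR operation, bits that were already set before the call are preserved.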
