DroneCAN: Fix H7 FDCAN and F7 bxCAN driver configuration#11607
Draft
daijoubu wants to merge 47 commits into
Draft
DroneCAN: Fix H7 FDCAN and F7 bxCAN driver configuration#11607daijoubu wants to merge 47 commits into
daijoubu wants to merge 47 commits into
Conversation
|
Test firmware build ready — commit Download firmware for PR #11607 238 targets built. Find your board's
|
SJW=8 was overly conservative (80% of bit time at 1Mbps with 10 quanta). SJW=3 is the standard value also used by the F7 driver. Tested with 6037 arm/disarm cycles at 500kbps: TEC=0, REC=0, zero errors.
…DCAN PLL2Q was 3 (266 MHz, invalid for FDCAN ≤ 80 MHz). Fix to 10 (80 MHz). Extend PLL2 guard from USE_SDCARD_SDIO to USE_SDCARD_SDIO || USE_DRONECAN so H7 boards with CAN but no SD card get PLL2 configured. Adopt upstream PLL2M/N formula (VCI=1.6 MHz, VCO=800 MHz) and error check on HAL_RCCEx_PeriphCLKConfig.
…pport Remove redundant PeriphClkInitStruct clock config from canardSTM32CAN1_Init. system_stm32h7xx.c already configures FDCAN to use PLL2Q (80 MHz) when USE_DRONECAN is defined; duplicating it in the driver overwrites with PLL1. Also add CAN1 pin definitions and USE_DRONECAN to KAKUTEH7WING target (PD0/PD1, CAN1_STANDBY PD3 disabled by default).
Use HAL_RCCEx_GetPeriphCLKFreq(RCC_PERIPHCLK_FDCAN) instead of HAL_RCC_GetPCLK1Freq() for bit timing calculation. FDCAN is clocked from PLL2Q (80 MHz) configured in system_stm32h7xx.c; using PCLK1 (100 MHz) produced a ~25% baud rate error causing immediate bus-off. Restore SJW to 3 for better synchronisation tolerance.
Remove high-frequency LOG_DEBUG messages from GNSS Fix/Fix2/Auxiliary handlers, onTransferReceived, dronecanInit, and gps_dronecan HDOP path that fired at 25 Hz and flooded the log. Fix PLL2 VCO input to target 1.6 MHz (PLL2M = HSE/1600000, PLL2N = 500) rather than 2.0 MHz, keeping the operating point clearly within VCIRANGE_0 (1-2 MHz) as the original SDCARD-only code did with PLL2M=5. VCO output remains 800 MHz; FDCAN (80 MHz via PLL2Q=10) and SDMMC (200 MHz via PLL2R=4) outputs are unchanged.
Drop high-frequency and verbose-but-low-value LOG_DEBUG(CAN messages: - dronecan.c: Battery Info (x2), GetNodeInfo, NodeStatus, TX success, RX loop, commented-out debug blocks - canard_stm32h7xx_driver.c: timing computation intermediates (Baudrate, Max Quanta, Prescaler BS, Prescaler, Timings summary) - canard_stm32f7xx_driver.c: same timing intermediates, TX success, In CAN Init, commented-out clock and RX blocks Retain error-path messages (decode failed, TX/RX error, init failures) and the single-line Prescaler/SJW/BS summary logged at init.
… operation Tested at 1 Mbps on KAKUTEH7WING hardware and confirmed bus operational.
…ivers Remove CubeMX boilerplate markers, commented-out dead code, and development-time question comments from both drivers.
Fix: DroneCAN GNSS messages were being applied to gpsSolDRV regardless of the configured GPS provider. Guard added in gps_dronecan.c where it belongs, keeping CAN transport layer unaware of GPS config.
…g difference F7 bxCAN HAL writes SJW directly to BTR register where hardware adds 1, so stored value 3 gives 4 tq. This wider SJW is needed for reliable bus operation on F7 targets and is different from H7 where SJW=1 is actual tq.
Prevents state machine from continuing in INIT state when the CAN peripheral fails to initialize.
Prevents out-of-bounds access when STATE_DRONECAN_FAILED is active.
Prevents stale pre-bus-off frames from storming the bus on recovery.
With AutoRetransmission=ENABLE, frames that fail on a degraded bus occupy FIFO slots indefinitely. All 32 slots fill, HAL_FDCAN_AddMessage returns HAL_ERROR, and all outgoing traffic stalls permanently with no indication until full bus-off. DroneCAN reliability is handled at the application layer via periodic republishing.
Matches the H7 driver pattern. Previously the return value was silently discarded; if timing computation failed, uninitialized stack bytes were passed to HAL_CAN_Init.
The H7 FDCAN 128x11 recessive-bit recovery sequence takes up to 11.264ms at 125kbps. The 1ms delay was restarting the counter before it could complete, preventing the node from ever exiting bus-off. 20ms gives safe margin above worst-case and allows time to detect immediate re-entry.
Guard against non-DroneCAN GPS provider at the transport boundary (handle_GNSS* functions) rather than in each leaf function in gps_dronecan.c. Also adds the guard to handle_GNSSRCTMStream which had none. Removes stale UNUSED(pgnssAux) and placeholder comment from dronecanGPSReceiveGNSSAuxiliary.
canardSTM32GetProtocolStatus() was called on every dronecanUpdate() invocation (~500Hz) to detect bus-off. Moved into the existing 1Hz task block — bus-off detection latency of up to 1s is acceptable. Adds LOG_DEBUG to report BusOff and ErrorPassive flags each second for bench diagnostics.
AutoBusOff=ENABLE handles the 128x11 recovery sequence automatically, but ESR.BOFF is a sticky read-only flag that is NOT cleared when hardware recovery completes. GetProtocolStatus() reads this flag, so the state machine was permanently stuck in STATE_DRONECAN_BUS_OFF after any bus-off event on F7 targets. Stop/Start re-enters init mode which clears ESR.BOFF, allowing recovery detection to work correctly.
Comment incorrectly stated '25MHz' as a supported HSE value — 25MHz fails the assert. CMake always provides HSE_VALUE per-target via -DHSE_VALUE=<n> so the stm32h7xx_hal_conf.h fallback of 25MHz is never used. Current targets use 8MHz (default) or 16MHz (KAKUTEH7WING).
…adence GetProtocolStatus() was called every dronecanUpdate() cycle (~500Hz) in BUS_OFF state. Moved inside the 20ms recovery timer block so it runs at the same cadence as RecoverFromBusOff() — still detects recovery within 20ms but reduces MMIO reads from ~500/sec to ~50/sec.
HAL_CAN_Stop/Start called from the scheduler context with CAN interrupts active caused a full FC lockup. Reverted to empty stub pending investigation of a safe mechanism to clear the sticky ESR.BOFF flag on F7.
Unconditional 1Hz LOG_DEBUG was flooding the bootlog with healthy status messages. Now only logs when an error condition is actually present.
HAL_CAN_AddTxMessage returns non-OK when all mailboxes are busy — a normal transient condition at startup. The log was noise. Matches the H7 driver which already handles this path silently.
DroneCAN float16 optional fields encode NaN when unpopulated. Without a guard, NaN * 100 converts to 0 on Cortex-M (ARM VCVT saturation), permanently blocking the HDOP fallback path. Also passes values through gpsConstrainHDOP() to prevent uint16_t overflow for extreme DOP values.
…ompatible gpsSolDRV has hdop but no vdop field. VDOP and EPV are not interchangeable (different units, conversion requires receiver UERE). lastVDOP was a dead store with no valid consumer.
…ames size Guards the CLI state name array against future enum additions — if a new state is added without updating the array, the build fails immediately.
Guards against out-of-bounds read if dronecanState is ever corrupted to STATE_DRONECAN_COUNT or beyond.
ABOM (CAN_MCR bit 6) is enabled in canardSTM32CAN1_Init. Per RM0410 ss40.7.6, with ABOM=1 the hardware manages the full bus-off recovery sequence automatically: after 128x11 recessive bits it cycles INRQ and clears ESR.BOFF without any software intervention required. Removes an incorrect TODO asserting ESR.BOFF is sticky — it is a hardware-managed status bit that clears when bus-off state is left.
Cast both MIN() arguments to int to avoid -Werror=sign-compare between dronecanState_e (unsigned enum) and STATE_DRONECAN_COUNT - 1 (int).
clang treats uint32_t as unsigned int, not unsigned long. Use %u to match the actual type of BusOff and ErrorPassive in canardProtocolStatus_t.
…NODES Previously last_seen_ms was the raw millis() timestamp when the node was last heard from, which equals FC uptime for active nodes. Configurators had no way to compute elapsed time without knowing current FC millis(). Now sends millis() - last_seen_ms so the field means "ms since this node was last heard from". Unsigned subtraction handles millis() wraparound correctly.
…NODE_INFO Apply the same fix as MSP2_INAV_DRONECAN_NODES: send millis() - last_seen_ms so the field means elapsed time since last heard from the node.
Switch FDCAN from FIFO to Queue mode so hardware transmits highest-priority frame first, matching libcanard's scheduler. Reduce staging depth 32→3 so libcanard's queue remains authoritative between poll cycles. Add HAL_FDCAN_TxBufferCompleteCallback ISR pump to refill the hardware slot immediately on TX complete. Wrap all canardBroadcast, canardRequestOrRespond, and canardCleanupStaleTransfers call sites with NVIC_DisableIRQ/EnableIRQ on FDCAN1_IT1 to prevent races between the ISR and main task on libcanard's queue. Masking helpers are no-ops on non-H7 targets.
d000077 to
1532585
Compare
…onecan Phase 2 of the combined DroneCAN driver rework (folds in iNavFlight#11560 content): F7 bxCAN ISR-driven TX refill: - Add NVIC_PRIO_CAN=4 to nvic.h (shared by H7 and F7) - Enable CAN_IT_TX_MAILBOX_EMPTY and CAN1_TX_IRQn at NVIC_PRIO_CAN - Add HAL_CAN_TxMailbox{0,1,2}CompleteCallback ISR pumps in dronecan.c - Wire dronecanMaskTxISR/UnmaskTxISR to NVIC_DisableIRQ/EnableIRQ(CAN1_TX_IRQn) - Add #else branch for SITL/AT32 (was missing, would break SITL build) TEC/REC/LEC error counters: - Extend canardProtocolStatus_t with tec, rec, lec fields - F7: populate from ESR register (bits 23:16, 31:24, 6:4) - H7: populate from ECR register (bits 22:16, 14:8) + PSR.LastErrorCode Typo fix: canardSTM32Recieve -> canardSTM32Receive across all drivers and call sites Other: - Add canardSTM32GetTxQueueFillLevel() to all three drivers (returns 0; no SW queue) - Make canard and memory_pool static in dronecan.c - Add cliDronecan CLI command showing bus health (BusOff, ErrorPassive, TEC, REC, LEC, fill levels)
Fix H7 ISR pump (was completely non-functional): - FDCAN_IT_TX_COMPLETE routes to LINE0 by default (ILS=0); enable FDCAN1_IT0_IRQn at NVIC_PRIO_CAN and add FDCAN1_IT0_IRQHandler - Change dronecanMaskTxISR from FDCAN1_IT1_IRQn to FDCAN1_IT0_IRQn Fix H7 TEC extraction from ECR register: - ECR[7:0]=TEC, ECR[14:8]=REC, ECR[23:16]=CEL (not TEC) - Previous code read CEL and labelled it tec Narrow ISR mask window in processCanardTxQueueSafe: - Mask only covers canardPeekTxQueue/canardPopTxQueue (linked-list ops) - HAL transmit call now runs with ISR unmasked - Re-peek after transmit to detect if ISR already popped the frame F7 RxBuffer ISR safety: - writeIndex/readIndex now volatile to prevent compiler caching - Remove unused file-scope rxMsg variable - rxBufferPushFrame/PopFrame now int8_t and static - Log warning on RX buffer full instead of silently dropping Remove redundant lec & 0x7 mask in cliDronecan (already masked at population)
Move public canardSTM32* functions above static helpers in H7 and F7 drivers. Move dronecanInit/Update/Get* to the top of dronecan.c with forward declarations for internal callbacks. Mark rxBufferNumMessages static in F7 driver. Add ISR-context warning comment to processCanardTxQueue.
- processCanardTxQueueSafe: simplify by holding NVIC mask across transmit call, eliminating the pointer-identity ABA assumption - F7 rxBuffer: add __DMB() barriers between data write and index advance to satisfy C99 memory ordering between ISR and main loop - F7 ComputeTimings: align max_quanta_per_bit with H7 (17, not 18) per Koppe reference recommendation - H7 GetProtocolStatus: document why ECR is read directly vs HAL - H7 RecoverFromBusOff: clarify CCCR.INIT clear is defensive no-op - dronecanUpdate: replace dead STATE_DRONECAN_COUNT case with default - cli: use PRId32 for int32_t format specifiers; add <inttypes.h> - SITL: remove Doxygen blocks from private static helpers
- Make processCanardTxQueue, shouldAcceptTransfer, onTransferReceived static - Replace ISR LOG calls with volatile counters (txErrCount, rxDropCount); log and clear at 1Hz from main loop via canardSTM32GetAndClearRxDropCount() - Add canardSTM32GetAndClearRxDropCount() to driver interface; F7 implements ring-buffer drop counter, H7/SITL return 0 (no SW ring buffer) - Check canardBroadcast/canardRequestOrRespond return values; log on OOM - H7: set RxBuffersNbr=0 (dedicated buffer was unused, wasted message RAM) - H7: reject unmatched standard-ID frames (FDCAN_REJECT for NonMatchingStd) - F7: move NVIC_EnableIRQ calls after all ActivateNotification calls succeed - Fix F7 bxCAN Doxygen: correct @brief, remove wrong @param, fix @RetVal - Fix all @RetVal docs: ret==0 is OK (CANARD_OK), not ret==1 - Cast pid_t to uint32_t before bit-shifting in SITL unique ID generation
- Wrap processCanardTxQueue in #if STM32H7||STM32F7 — function is ISR-only and has no callers in SITL builds; static + no callers caused -Werror=unused-function on CI - Atomic read-clear of txErrCount: hold dronecanMaskTxISR across snapshot+zero to prevent ISR increment being silently dropped - Atomic read-clear of rxDropCount: disable CAN1_RX0_IRQn across snapshot+zero in canardSTM32GetAndClearRxDropCount (F7) - Make all eight file-local handler functions static: send_NodeStatus, handle_NodeStatus, handle_GNSSAuxiliary, handle_GNSSFix, handle_GNSSFix2, handle_GNSSRCTMStream, handle_BatteryInfo, handle_GetNodeInfo - H7 global filter: FDCAN_REJECT_REMOTE for both RTR frame params (was FDCAN_FILTER_REMOTE which passed RTR frames to normal filter) - H7 Receive: add comment that DataLength==byte count holds only in FDCAN_FRAME_CLASSIC mode (FDCAN_DLC_BYTES_0..8 equal 0..8) - F7 ring buffer: use local index snapshots in push/pop to avoid double volatile re-reads of writeIndex/readIndex
- H7 Transmit: add FDCAN_FRAME_CLASSIC comment on DataLength assignment matching the comment already on the receive path - F7 init: add comment explaining pre-shifted timing register values - SITL: map RTR flag in sitlCANFrameToLinux/FromLinux, consistent with hardware drivers - SITL canardSTM32Transmit: add ERR frame guard matching hardware drivers - handle_GNSSRCTMStream: remove dead decode, add comment that RTCM forwarding is not yet implemented - dronecanInit: add default case to bitrate switch for EEPROM corruption - vendor_specific_status_code: explicit (uint16_t) cast with comment acknowledging bits 16-30 of armingFlags are not transmitted - fport.c: remove dead static volatile frameErrors counter (written but never read; triggered -Werror=unused-but-set-variable on GCC 16)
- H7 Receive: add bounds guard on DataLength before assignment to data_len (uint8_t); mirrors existing TX path guard; safe no-op in FDCAN_FRAME_CLASSIC mode but prevents silent truncation if DLC ever exceeds 8 - F7: fix stale comment hfdcan->ErrorCode -> hcan1.ErrorCode - dronecanInit: fix mixed tab/space indentation in canardInit() call - dronecanGetBitrateKbps: return 500 for DRONECAN_BITRATE_COUNT and default cases, matching what dronecanInit actually selects - Fix typo: incremeneted -> incremented - SITL GetRxFifoFillLevel: add comment explaining FIONREAD on SOCK_RAW returns next-datagram size only, so result is 0 or 1
FDCAN_FILTER_DUAL with FilterID1=0x0, FilterID2=0x1FFFFFFF only matched those two exact IDs. Replace with FDCAN_FILTER_MASK (pattern=0, mask=0) which accepts any extended ID. Also fix stale comments in H7 driver and dronecanInit.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes configuration errors and robustness issues in the H7 FDCAN and F7 bxCAN DroneCAN drivers. Migrates F7 TX from polling to ISR-driven, refilling the 3 hardware TX mailboxes from the libcanard queue in the TX-complete callback. Hardware verified on KAKUTEH7WING (H7) and MATEKF765SE (F7).
Supersedes #11560.
Changes
H7 FDCAN driver (
canard_stm32h7xx_driver.c)HAL_RCC_GetPCLK1Freq()(APB1); now correctly usesHAL_RCCEx_GetPeriphCLKFreq(RCC_PERIPHCLK_FDCAN)TxBuffersNbr = 0andRxBuffersNbr = 0— driver uses FIFO queue only; dedicated buffers unusedTXBCR) before clearingCCCR.INITon bus-off recoveryFDCAN_FILTER_DUALonly matched two exact IDs (0x0 and 0x1FFFFFFF); replaced withFDCAN_FILTER_MASK(pattern=0, mask=0) to accept all extended IDsNonMatchingStdwasFDCAN_ACCEPT_IN_RX_FIFO0; nowFDCAN_REJECT— standard frames are not used by DroneCANFDCAN_FILTER_REMOTE; nowFDCAN_REJECT_REMOTERxHeader.DataLengthbeforememcpy;FDCAN_DLC_BYTES_0..8 == 0..8holds only inFDCAN_FRAME_CLASSICmodecanardSTM32GetAndClearRxDropCount()stub (FIFO0 has 30 hardware slots; no software ring buffer)@retvaldoc commentsPLL2 clock configuration (
system_stm32h7xx.c)USE_DRONECAN(wasUSE_SDCARD_SDIOonly)STATIC_ASSERTfor HSE_VALUE divisibility by PLL2MHAL_RCCEx_PeriphCLKConfigreturn valueF7 bxCAN driver (
canard_stm32f7xx_driver.c)HAL_CAN_TxMailbox{0,1,2}CompleteCallback; ISR refills the 3 hardware TX mailboxes from the libcanard queue__DMB()barriers for Cortex-M7 correctnesscanardSTM32GetAndClearRxDropCount(): IRQ-masked atomic read-clear ofrxDropCount; ISR increments on RX FIFO overflow instead of loggingNVIC_EnableIRQcalls after allActivateNotificationcalls succeedmax_quanta_per_bit: was 18, reduced to 17 to match CiA paper maximum for ≤ 500 kbps@param hfdcanhfdcan->ErrorCodereference →hcan1.ErrorCodeDroneCAN state machine (
dronecan.c)staticstatic volatile uint32_t txErrCount; ISR increments, 1 Hz loop reads+clears with NVIC mask held across snapshotcanardBroadcastandcanardRequestOrRespond; log on OOMvendor_specific_status_code:(uint16_t)(armingFlags & 0xFFFF)with explicit truncationhandle_GNSSRCTMStream: replace dead decode withUNUSED(transfer)stubdefault:case todronecanInitbitrate switch with error log and 500 kbps fallbackSTATE_DRONECAN_FAILEDset on peripheral init failureSTATE_DRONECAN_COUNTsentinel withSTATIC_ASSERTprotecting CLI name arrayGPS DroneCAN driver (
gps_dronecan.c)gpsSolDRVregardless of configured providergpsConstrainHDOP()to prevent uint16_t overflowlastVDOPdead storeSITL driver (
canard_sitl_driver.c)canardSTM32GetAndClearRxDropCount()stubsitlCANFrameToLinux/sitlCANFrameFromLinuxcanardSTM32Transmitpid_ttouint32_tbefore bit-shifting for unique IDFIONREADcomment explaining 0-or-1 frame-per-tick behaviourDriver header (
canard_stm32_driver.h)uint32_t canardSTM32GetAndClearRxDropCount(void)declarationCLI (
cli.c)PRId32format specifiers for fixed-width integers; add#include <inttypes.h>KAKUTEH7WING target
USE_DRONECANwith CAN1 RX/TX pin definitions (PD0/PD1)Testing
Pending: HITL validation, overnight stability run on H7 and F7 variants, real-airframe flight test.