fix: queue retained MQTT publishes during disconnect, replay on reconnect
A user reported that on cold start, a subset of HA-discovered entities
sit at 'unavailable' indefinitely. Investigation confirmed cgateweb's
own startup path is the dominant culprit:
  cgateWebBridge.start():
    line 287  _updateBridgeReadiness('startup')          -> publishNow
    line 299  haBridgeDiagnostics.publishNow('startup')
    line 301  _updateBridgeReadiness('startup-complete') -> publishNow
These three calls fire ~17 retained publishes each before MqttManager
has actually connected (the connect happens inside
connectionManager.start(), which only awaits start, not readiness).
With config-once dedup that's ~35 publishes, close to the reporter's
"38 dropped" roll-up; the remainder is consistent with state events
that race during the MQTT settle window.
Previously, MqttManager.publish() returned false and incremented a
counter when disconnected, then logged "N publish(es) dropped while
disconnected" on reconnect. The messages themselves were lost forever,
so HA had nothing to bind retained-state subscriptions to.
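The pre-fix behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the real MqttManager: the class name `DroppingPublisher`, its fields, and the topic `example/state` are all hypothetical; only the "return false, count, log on reconnect" shape comes from the description.

```javascript
// Hypothetical stand-in for the pre-fix behaviour; not the real MqttManager.
class DroppingPublisher {
  constructor() {
    this.connected = false;
    this.droppedCount = 0;
    this.sent = []; // stands in for messages that reached the broker
  }

  publish(topic, payload, options = {}) {
    if (!this.connected) {
      // The attempt is counted, but the message itself is discarded.
      this.droppedCount += 1;
      return false;
    }
    this.sent.push({ topic, payload, options });
    return true;
  }

  onConnect() {
    this.connected = true;
    if (this.droppedCount > 0) {
      console.warn(`${this.droppedCount} publish(es) dropped while disconnected`);
      this.droppedCount = 0;
    }
  }
}

// A retained publish attempted before the connection is up never arrives.
const pub = new DroppingPublisher();
const accepted = pub.publish('example/state', 'ON', { retain: true });
pub.onConnect();
console.log(accepted, pub.sent.length);
```

Even after `onConnect()` runs, `pub.sent` stays empty: the counter records that something was lost, but nothing is left to replay.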
Add a bounded retain-aware queue:
  - Map<topic, {payload, options}>: newest-wins per topic, so a stale
    level=0 is correctly overwritten by a fresh level=128 if both
    queue during the same disconnect window.
  - Bounded by mqttPendingPublishMaxEntries (default 1000); when full,
    the oldest entry is evicted and the eviction count is logged as a
    warning on flush.
  - Non-retained publishes still drop (one-shot events whose meaning
    would be invalidated by replay).
  - Flush on (re)connect with a single info-line replay summary.
  - Errors mid-flush are logged per-topic but don't halt the rest.
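The queue semantics listed above can be sketched as below. This is an illustration under stated assumptions, not the code in this commit: `RetainedPublishQueue`, `enqueue`, `flush`, the log wording, and the `example/...` topics are hypothetical names. It also assumes that re-queuing a topic moves it to the newest position for eviction purposes, which the commit message does not specify.

```javascript
// Sketch only: class, method names, and log wording are hypothetical.
class RetainedPublishQueue {
  constructor(maxEntries = 1000) {
    this.maxEntries = maxEntries;
    this.pending = new Map(); // topic -> { payload, options }, in insertion order
    this.evicted = 0;
  }

  // Returns true if the publish was queued, false if it was dropped.
  enqueue(topic, payload, options = {}) {
    if (!options.retain) return false; // one-shot events are not replayed
    // Assumption: delete first so a re-queued topic becomes the newest entry.
    const existed = this.pending.delete(topic);
    if (!existed && this.pending.size >= this.maxEntries) {
      const oldestTopic = this.pending.keys().next().value;
      this.pending.delete(oldestTopic);
      this.evicted += 1;
    }
    this.pending.set(topic, { payload, options });
    return true;
  }

  // sendFn(topic, payload, options) may throw; one failure does not stop the rest.
  flush(sendFn, log = console) {
    const entries = [...this.pending];
    const evicted = this.evicted;
    this.pending.clear();
    this.evicted = 0;
    let replayed = 0;
    let failed = 0;
    for (const [topic, { payload, options }] of entries) {
      try {
        sendFn(topic, payload, options);
        replayed += 1;
      } catch (err) {
        failed += 1;
        log.error(`replay failed for ${topic}: ${err.message}`);
      }
    }
    if (evicted > 0) log.warn(`${evicted} queued publish(es) evicted while disconnected`);
    log.info(`replayed ${replayed} queued retained publish(es), ${failed} failed`);
    return { replayed, failed, evicted };
  }
}

// Usage: two writes to one topic collapse to the newest value.
const queue = new RetainedPublishQueue(2);
queue.enqueue('example/light/1/level', '0', { retain: true });
queue.enqueue('example/light/1/level', '128', { retain: true });
queue.enqueue('example/light/2/level', '255', { retain: true });
const oneShot = queue.enqueue('example/event/click', 'x', {}); // not retained
const delivered = [];
const summary = queue.flush((topic, payload) => delivered.push(`${topic}=${payload}`));
```

A `Map` keeps insertion order, so the first key is always the eviction candidate, and keying by topic gives newest-wins without extra bookkeeping.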
8 new tests cover queueing, newest-wins, eviction, flush, mid-flush
error tolerance, and non-retain pass-through.
1217/1217 passing.
Bumps to 1.8.7.
homeassistant-addon/CHANGELOG.md (6 additions, 0 deletions)
@@ -5,6 +5,12 @@ All notable changes to the C-Gate Web Bridge Home Assistant add-on will be docum
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.8.7] - 2026-05-05
+
+### Fixed
+- **Startup-race silent publish drops**: HA Discovery configs and initial state values published before the MQTT broker was fully connected were silently dropped: `MqttManager.publish()` incremented a counter but never replayed the lost messages. Affected entities sat at `unavailable` in Home Assistant indefinitely (until C-Gate happened to emit a fresh event for that group while MQTT was up). `cgateweb`'s own startup path is the largest culprit: `cgateWebBridge.start()` calls `_updateBridgeReadiness('startup')` and `haBridgeDiagnostics.publishNow('startup')` before the broker connects, so ~38 retained publishes per restart go to /dev/null.
+- `MqttManager` now keeps a bounded retain-aware queue of publishes attempted while disconnected. Map semantics give us newest-wins-per-topic so a stale `level=0` is correctly overwritten by a fresh `level=128` if both queue during the same disconnect window. The queue is bounded (default 1000 entries; configurable via `mqttPendingPublishMaxEntries`) and oldest entries are evicted with a warning if the broker stays unreachable. On (re)connect, the queue is flushed and the count is logged. Non-retained publishes (one-shot events whose meaning would be invalidated by replay) are still dropped; only retained state is queued.