fix(ovms_server_v3): don't check for m.net.good.sq as a prerequisite for connecting to MQTT broker #1151

Open
zhongfu wants to merge 1 commit into openvehicles:master from zhongfu:fix/server-v3-no-m-net-good-sq-check

@zhongfu
Contributor

@zhongfu zhongfu commented May 17, 2025

prevents users from accidentally locking themselves out, if they can only connect via MQTT

(e.g. no Server v2/WiFi/etc access, and they cannot access the module physically)

fixes #1150 (maybe?), but I'm not sure if there's any other reason we might want to keep this check

@dexterbg
Member

dexterbg commented May 18, 2025

@Jamidd can you please comment on this / discuss this with zhongfu, as you're the author of the signal quality check.

Originating pull request: #1046

@CrashOverride2
Contributor

CrashOverride2 commented Jul 29, 2025

@zhongfu, without delving too deep into the connection logic, could this issue be related to the "Waiting for network connectivity..." message when connecting to server v3? I’ve noticed a connection delay during module reboots or cold boots, which isn’t an issue for me. However, every two to three weeks, the v3 connection gets stuck in this state. During this time, the v2 connection also experiences significant delays. The signal quality (SQ) often drops to around -93 dBm and doesn’t recover. If I attempt to restart the network, the module crashes.

[Image: v3_waiting]

I haven’t investigated the connection further, but when both connections drop, I typically restart the module via the Wi-Fi AP. I have left the AP enabled for this purpose, even though disabling it is generally recommended, because the modules are no longer easily accessible. I have two official V3.3 modules, and they both behave the same way. Since the cars are usually in the same location, it could be an issue with network coverage rather than a software fault. Strangely, though, when I restart the module in the same location, the signal quality improves, and both v2 and v3 connections work seamlessly.

[Image: v3_recover]

However, I agree with you that the low SQ does indicate no v3 connectivity. I think if the module is allowed to continue attempting to connect, it may crash due to the blocking task trying to transmit data.

@zhongfu
Contributor Author

zhongfu commented Jul 30, 2025

@CrashOverride2

> could this issue be related to the "Waiting for network connectivity..." message when connecting to server v3

yes, it's the direct cause of it (m.net.good.sq is "no", because your signal quality is below the min threshold), but I don't think that's the root cause
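For readers unfamiliar with the flag, here is a minimal Python sketch of how a "good signal" flag with two thresholds could gate the v3 connection attempt. It is an illustration only, not the firmware's code (which is C++): the names `SignalQualityFlag` and `can_start_v3` are hypothetical, and the threshold values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SignalQualityFlag:
    """Hypothetical model of a 'good signal' flag with hysteresis."""

    good_dbm: float = -93.0  # assumed: at or above this, the flag becomes good
    bad_dbm: float = -95.0   # assumed: at or below this, the flag becomes bad
    good: bool = False

    def update(self, sq_dbm: float) -> bool:
        """Feed a new signal reading and return the resulting flag state."""
        if sq_dbm >= self.good_dbm:
            self.good = True
        elif sq_dbm <= self.bad_dbm:
            self.good = False
        # Readings between the two thresholds leave the flag unchanged.
        return self.good


def can_start_v3(network_up: bool, good_sq: bool, require_good_sq: bool) -> bool:
    """Decide whether a connection attempt may start.

    With require_good_sq=True, a flag stuck at "no" blocks every attempt.
    With require_good_sq=False (the behaviour this PR proposes), only basic
    network connectivity is required.
    """
    return network_up and (good_sq or not require_good_sq)


flag = SignalQualityFlag()
flag.update(-97.0)  # weak reading: flag is (or stays) bad
blocked = can_start_v3(network_up=True, good_sq=flag.good, require_good_sq=True)
allowed = can_start_v3(network_up=True, good_sq=flag.good, require_good_sq=False)
```

In this model a reading that never climbs back above the upper threshold keeps the flag at "no" indefinitely, which matches the "Waiting for network connectivity..." symptom described above.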

if you get a significantly improved signal strength after a module reboot, but without moving the car or module... perhaps there's something wrong with WiFi? or that the module wasn't connecting to WiFi prior to the reboot, and that's actually the LTE/GSM signal strength

> I think if the module is allowed to continue attempting to connect, it may crash due to the blocking task trying to transmit data.

yeah, after using OVMS for a bit more, I realize that most of my crashes are apparently caused by network latency

this seems to affect TLS connections disproportionately, so I've disabled TLS for v2, since that already has... some form of encryption. TLS for v3 stays on since it would be completely plaintext otherwise

so now, my module basically only crashes when roaming off WiFi onto cellular, and the connection stays rock solid otherwise

with that in mind, it probably kind of makes sense to omit the m.net.good.sq check for v2 and keep it for v3 (since it's less safe to run v3 without TLS, but enabling TLS makes it more likely to crash with poor network quality)
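The per-server idea in this comment can be sketched as a small policy table. This is a hypothetical Python illustration, not OVMS code: `REQUIRE_GOOD_SQ` and `should_attempt_connect` are made-up names, and the table simply encodes the suggestion above (skip the check for v2, keep it for v3).

```python
# Hypothetical policy: which server connections insist on a good signal flag.
REQUIRE_GOOD_SQ = {
    "v2": False,  # no check, so v2 keeps trying on a weak signal
    "v3": True,   # check kept, since v3 relies on TLS
}


def should_attempt_connect(server: str, network_up: bool, good_sq: bool) -> bool:
    """Return True if a connection attempt to `server` may start."""
    if not network_up:
        return False
    # A poor signal only blocks servers that require the good-signal flag.
    return good_sq or not REQUIRE_GOOD_SQ[server]


# With the network up but a poor signal, only v2 would attempt to connect.
attempts = {
    name: should_attempt_connect(name, network_up=True, good_sq=False)
    for name in REQUIRE_GOOD_SQ
}
```

The trade-off is the one raised in the thread: a user who can only reach the module through v3 is still locked out under this policy whenever the flag stays at "no".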

@CrashOverride2
Contributor

Problem 1
Some modems in the V3.3 modules have a bug—like mine—where the signal quality (sq) remains low and never recovers. This causes the V2 console to respond very slowly, and the V3 module to disconnect due to the incorrect m.net.good.sq reading.

Problem 2
This may be related to the OVMS firmware, which could be using an inefficient method to maintain an online connection.

For Problem 1, I’m collecting modem firmware versions from users who experience the issue where every 1–4 weeks the module goes completely offline, requiring a manual reset to come back online. I have two V3.3 modules:
- One with firmware LE20B03SIM7600M21-A_CUS_JT (2022), which works fine.
- Another with firmware LE20B05SIM7600G (2025), which shows the problem. Two other users who have the same problem also report this exact firmware version.

For Problem 2, since I track trips, I became curious about how often the V3 connection is lost. I ran experiments with @zbchristian’s module, based on a LilyGO T-Call V1.1 using a Simcom 7670E (LASE) modem. This modem behaves similarly to the original V3.3 modules when the modems are functioning correctly. It does not have the “stuck sq” problem, but it does suffer from GPS dropouts, where it loses its fix.

I installed external LTE and GPS antennas, which improved reception by about 15 dB, but that didn’t solve the problem. Updating the modem firmware from B01 to B07 reduced GPS fix losses, but disconnects still occurred. After removing the m.net.good.sq check, the situation improved significantly, though occasional disconnects still happen.

For comparison, my electric scooter has a built-in modem integrated into its ECU, and its route tracking is flawless. It uses a Quectel EC25-E, leveraging its native MQTT capabilities to connect directly to the manufacturer’s servers. I’ve considered testing the same approach, since the Simcom modems offer similar functionality.

Of course, since the LilyGO module uses a different modem, it’s not a direct comparison to the original modules. However, it could serve as a starting point. In my view, the first step should be testing the “stuck sq” problem by upgrading or downgrading the modem firmware.

Track before optimizations:
[Image: before]

Track after optimizations:
[Image: after]

@CrashOverride2
Contributor

Hi,

I tried to find firmware for the newer 7600 modules, but this turned out to be difficult. There are many different versions available, and I couldn’t even identify the one currently installed. Because of that, I decided not to attempt flashing my module—I don’t want to risk bricking it.

Instead, I went with my other approach: integrating native MQTT transport (MQTT via AT commands) and rewriting parts of the OVMS firmware to support this, at least for the A7670 modems. The results look promising—I can retrieve all waypoints except in areas with very poor network reception.

Implementation took longer than expected due to the asynchronous processes and blocking tasks waiting on the modem, but I managed to resolve these issues. I did encounter some crashes caused by watchdog timeouts, but I’m close to having a stable system.

This work required significant changes to ovms_cellular, ovms_server_v3, and, of course, the simcom_a7670 driver. A pull request is unlikely for now, as the code still needs optimization and cleanup.

@markwj
Member

markwj commented Aug 15, 2025

> Instead, I went with my other approach: integrating native MQTT transport (MQTT via AT commands) and rewriting parts of the OVMS firmware to support this, at least for the A7670 modems. The results look promising—I can retrieve all waypoints except in areas with very poor network reception.

I really think this approach is a non-starter. For the simple reason that it would require two separate network stacks and two separate MQTT protocol libraries (one for cellular and another for WiFi).

@CrashOverride2
Contributor

That’s true—if native transport is used, there’s no IP connectivity at all, so no parallel V2 or push services.
In my case, this works fine because I only use the modem connection. The Wi-Fi in my parking lot is too weak, causing constant switching until the system eventually crashes. And for just a few bytes of data, I don’t see any reason why switching to home Wi-Fi would make sense. Of course, there are other use cases, such as local automation for car charging, where it could be useful.

So the main thing we need to fix is the “stuck SQ” issue. Once that’s resolved, we could remove the good-SQ check for V3/MQTT connections, which would prevent a complete lockout when only V3 is used.

Do you know of, or have, any firmware versions for the newer 7600 modems so I can try downgrading or upgrading mine?

Development

Successfully merging this pull request may close these issues.

Bad network->modem.sq.good setting may prevent Server v3 connection from being initiated

4 participants