Add connection check to Trilogy#134
Conversation
This adds a very cheap connection check to Trilogy. This cheap check can be used before a connection for example is checked out from a connection pool. It allows for making the client more robust in the case of the server having disconnected (like a failover, an intermediate proxy like ProxySQL or VTGate restarted etc) but it can be reconnected safely. The implementation here matches what is also done in the Go driver, see also https://github.blog/2020-05-20-three-bugs-in-the-go-mysql-driver/ where this is decribed. It works by doing a non-blocking `recv` to peek if there's any data. If there is, or if the function would otherwise block it means the connection is still safe. If the TCP connection has been closed, `recv` immediately returns with zero bytes read. This indicates the connection is no longer safe to use. We do this directly on the file descriptor to be consistent regardless of things like TLS being used.
|
This is great and we hacked something similar on our private mysql2 fork some time ago. Would love this to ship and to be used by Rails ❤️ |
|
Hey @dbussink, I'm extremely curious about this change! Coincidentally we are dealing with an issue where applications that use trilogy are unable to recover after a proxysql restart as application keeps trying to use broken connections. We haven't found the root cause just yet but I'm eager to try this to see if it helps with our issue. I was wondering if you already have an idea of how the change in Rails may look like? Or perhaps you have a draft branch. It's okay if not, I was thinking to at least trying to use the new check as an addition (or a substitution?) to the as this is what Rails uses to I'm going to try this feature on Monday against our issue and get back with results. Thank you! |
rails/rails#49811 could be a cause (although that would probably be any Rails 7.1 app, not just ones using trilogy) |
The idea is not to use this in the same way a I don't have anything here yet, but my first idea is to add this to the connection pool logic then but it's all up for debate of course. Maybe @composerinteralia has ideas for this already though. |
|
This sounds like a valuable API to have in the library. Rails [now] already defers the verification, such that it's not done on pool checkout, but only when necessary (about to run a non-retryable query, could safely reconnect now, have not yet used the connection since it was checked out). And in that situation, I think the full So I don't think this can actually save any round-trips we'd currently be doing. I do think it could be interesting, though, as an additional check we could use when running additional queries on an already-used connection -- it would be ridiculous to re-ping before every query, but a quick pre-query peek could well save some in-flight requests from trouble. |
That's not the goal here either I think. The goal is to provide a cheap check that can be done on pool checkout.
Right, that's the whole idea here. It's cheap enough to do on pool checkout, so we can avoid errors on the case that the server restarted for connections that weren't doing mid-flight queries. That can significantly reduce the number of exceptions for example during a primary failover, or when using something like Vitess, when vtgates are restarted. |
Yes, Rails already handles that, and has for a long time. Historically we did it with a ping on checkout; we now do it by detecting the problem and reconnecting + recovering mid request. |
I think that we also have rails/rails#49811 in the mix here for us internally, which is causing also a problem for us that it's never reconnecting. With the details in that PR I can easily reproduce a broken reconnect logic in our case (if the first query after a broken connection does a query which uses
How do you think we should go about this one in Rails itself? I'm not really sure how that would be used then there, you probably know better than I do. |
|
I'm picturing that we could re-check it here, where we otherwise breeze through just trusting the previously verified [within request] connection is still good: https://github.com/rails/rails/blob/d53ef2c18073a3a662edddd3fbbce32805e1135a/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb#L1012-L1014 |
|
I had a similar idea: main...jhawthorn:trilogy:connected_p I was also checking getsockopt w/ SO_ERROR but didn't get around to testing how useful that was. So I'm in favour of having an API like this but have two thoughts I'd like to know others opinions on:
|
@jhawthorn I don't really have strong opinions on either really. The bit I prefer about the implementation here is that code lives also in the C driver bits, seems to be long a bit more there than purely in the Ruby extension code. Naming wise I'm up for changing it to whatever is preferred really.
The reason I kept exceptions here is that in case of any other unexpected errors (so it's not a regular closed connection or an open one), it seems useful to know something unexpected is broken / wrong, vs. hiding that if it's purely true / false. |
|
Thank you! Merged as #154 using the |
This adds a very cheap connection check to Trilogy. This cheap check can be used before a connection for example is checked out from a connection pool.
It allows for making the client more robust in the case of the server having disconnected (like a failover, an intermediate proxy like ProxySQL or VTGate restarted etc) but it can be reconnected safely.
The implementation here matches what is also done in the Go driver, see also https://github.blog/2020-05-20-three-bugs-in-the-go-mysql-driver/ where this is decribed.
It works by doing a non-blocking
recvto peek if there's any data. If there is, or if the function would otherwise block it means the connection is still safe.If the TCP connection has been closed,
recvimmediately returns with zero bytes read. This indicates the connection is no longer safe to use. We do this directly on the file descriptor to be consistent regardless of things like TLS being used.With this addition, I also want to propose a change then in Rails to start using this there as well on connection pool checkout.