I was looking at basically unrelated stuff and noticed this via code inspection by accident.
In expect_msgs_impl() we have:
FLOW_LOG_WARNING("struc::Channel [" << *this << "]: Registered a (one-off? = [" << one_off << "]) "
"*log-in request* expectation (union-which = [" << int(which) << "]); that raises their "
"total count to 1. Would immediately queue handler(s) to fire if "
"we had collected 1 messages matching that union-which; but we had not; however we had "
"cached 1 message with different union-which = "
"[" << int(m_rcv_pending_msgs.begin()->first) << "]. "
"So: peer expects log-in request X, while other side sent log-in request Y. "
"This might be a protocol error on the user's part on this or other side or both.");
// Subtlety: See similar comment in the on-receive handler where it also emits this error.
handle_new_error(error::Code::S_STRUCT_CHANNEL_GOT_UNEXPECTED_LOG_IN_REQUEST, "expect_msgs_impl()");
But then there is no handlers_poll() there or anywhere up the call-stack up to expect_log_in_request() (the public API being called to result in the above code path). So the error won't be emitted via the error callback. Moreover, if it were emitted, that'd be against current contract which states that callbacks can execute only in the on-active-FD path (e.g., if user loop's epoll_wait() detects an FD event of interest and tells the Channel).
Resolving this might require a little bit of thinking; what's the right thing to do? Naively we could "just" say that particular API can trigger an error, invoking the error callback. But that might be annoying and unusual. I/we should think harder before just going with it. Like, maybe it's best to emit this problem some other, less annoying way? Perhaps simply use an out-arg Error_code*. That has its own slight ugliness (no other receive-direction Channel APIs work that way), but maybe who cares.
--
Very key context:
Several factors combine to make this bug, while definitely a bug, almost impossible to really happen anytime soon. Firstly, expect_log_in_request() in the first place is not something a user really worries about. Really user channels are always today created in already-logged-in phase. This call is in practice used internally for the session-master channel in our ipc::session internals, when setting up the session out of which user channels actually appear.
(That said, the log-in API is formally publicly available. It is conceivable a user would use it for their own needs somehow. It's just not likely. Probably they'll just set up a session - where we do all this internally - and then have nice easy already-logged-in, safe channels.)
Beyond that, the actual error happens if (1) a would-be log-in message arrives; (2) then the user registers which log-in message they expect; (3) and it's the wrong kind of message. Presumably the user of the log-in phase feature knows which message is the log-in message. Like in particular this error won't happen with our internal session-master-channel, where this feature is in practice used today.
So priority-wise... not high.
It's a bit of an ugly thing though. Should get rid of it.
--
In addition:
- Look into how it affects the async-I/O-pattern wrapper struc::Channel (the above is in the core, struc::sync_io::Channel).
- I have already done so, and only found the above problem; but just in case: Do go over all sources of error (handle_new_error() calls) and ensure the error callback is properly fired per contract (and the contract is reasonable).
- Don't forget to correct (as needed) the expect_log_in_request()-using internal ipc::session code, if there's a new way an error can be triggered involving it. (Note: Since we so tightly control our own session-master-channel code, it might not be bad to just assert() that this goes fine - if it doesn't it's just an internal bug, so why emit it to user? We wouldn't on an internal buggy null pointer dereference or something.)
I was looking at basically unrelated stuff and noticed this via code inspection by accident.
In expect_msgs_impl() we have:
But then there is no handlers_poll() there or anywhere up the call-stack up to expect_log_in_request() (the public API being called to result in the above code path). So the error won't be emitted via the error callback. Moreover, if it were emitted, that'd be against current contract which states that callbacks can execute only in the on-active-FD path (e.g., if user loop's epoll_wait() detects an FD event of interest and tells the Channel).
Resolving this might require a little bit of thinking; what's the right thing to do? Naively we could "just" say that particular API can trigger an error, invoking the error callback. But that might be annoying and unusual. I/we should think harder before just going with it. Like, maybe it's best to emit this problem some other, less annoying way? Perhaps simply use an out-arg Error_code*. That has its own slight ugliness (no other receive-direction Channel APIs work that way), but maybe who cares.
--
Very key context:
Several factors combine to make this bug, while definitely a bug, almost impossible to really happen anytime soon. Firstly, expect_log_in_request() in the first place is not something a user really worries about. Really user channels are always today created in already-logged-in phase. This call is in practice used internally for the session-master channel in our ipc::session internals, when setting up the session out of which user channels actually appear.
(That said, the log-in API is formally publicly available. It is conceivable a user would use it for their own needs somehow. It's just not likely. Probably they'll just set up a session - where we do all this internally - and then have nice easy already-logged-in, safe channels.)
Beyond that, the actual error happens if (1) a would-be log-in message arrives; (2) then the user registers which log-in message they expect; (3) and it's the wrong kind of message. Presumably the user of the log-in phase feature knows which message is the log-in message. Like in particular this error won't happen with our internal session-master-channel, where this feature is in practice used today.
So priority-wise... not high.
It's a bit of an ugly thing though. Should get rid of it.
--
In addition: