compute-domain-daemon: skip SIGUSR1 when DNS mapping update only removes nodes#919
Conversation
Signed-off-by: AkshatDudeja77 <akshat.dudeja77@gmail.com>
a620c5d to
f8ad4c0
Compare
That's a great goal. But how do we detect this situation with sets being our central data structures? Write two sets down, on a piece of paper. This could look like this: What do you need to do in order to find out if set A has an item removed compared to set B? Do you need to look at more than just set size? I genuinely wonder what your goals and incentives are. This is I think your sixth pull request of this style against this repository, and I believe we need to think about providing candid feedback soon :). These are wild times! |
jgehrcke
left a comment
There was a problem hiding this comment.
logically incorrect, as expressed in regular comment
|
For context, I still find it quite likely that this is AI-generated noise. Last time I looked into this, I noted: |
|
@jgehrcke thank you for your point about comparing sets rather than just sizes, I finally do understand what you're saying. My intention with the current change was only to avoid sending SIGUSR1 when the update strictly removes nodes (i.e. when the mapping shrinks). In cases where For transparency: I used AI to draft the initial idea here, but I’ve reviewed and adjusted the code and tests myself. |
f8ad4c0 to
a7ddddf
Compare
|
@AkshatDudeja77 Can you write some unit testcases around it |
13461a6 to
e40c7ed
Compare
@rajatchopra Refactored the SIGUSR1 decision into a helper and added unit tests covering no-change, pure-removal, additions, replacements, remove+add, and fresh-process cases. This now explicitly checks for newly added peers rather than relying on size or subset assumptions. Please let me know if there are additional edge cases you'd like covered |
| // - the mapping was updated, | ||
| // - the process is not fresh, and | ||
| // - at least one new IP (peer) was added. | ||
| func shouldSendSIGUSR1(oldIPs, newIPs IPSet, updated, fresh bool) bool { |
There was a problem hiding this comment.
Do we still need updated as an argument into this function? Is it directly implied by oldIPs and newIPs?
There was a problem hiding this comment.
Good point, updated is handled at the call site now, so i removed it from the helper and aligned the tests accordingly
There was a problem hiding this comment.
Updated the docstring to reflect the current function behavior after removing updated.
Also fixed the IPSet usage at the call site by explicitly constructing the new IP set from the DNS mapping to ensure correct types
|
Unknown CLA label state. Rechecking for CLA labels. Send feedback to sig-contributor-experience at kubernetes/community. /check-cla |
rajatchopra
left a comment
There was a problem hiding this comment.
/lgtm
@AkshatDudeja77 please squash the commits
|
LGTM label has been added. DetailsGit tree hash: be76cc6eec49013d0f97eb3c7c4c8180775a0392 |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: AkshatDudeja77, rajatchopra The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Signed-off-by: AkshatDudeja77 <akshat.dudeja77@gmail.com>
418084a to
65cdb23
Compare
❌ Deploy Preview for dra-driver-nvidia-gpu failed. Why did it fail? →
|
|
@rajatchopra squashed commits as requested, thanks for the review |
|
@dims please take a look for approval when you get the chance |
|
@jgehrcke can you please chime in? |
Fixes #913
The IMEXDaemonUpdateLoopWithDNSNames currently sends SIGUSR1 to the IMEX
daemon whenever the DNS/IP mapping changes.
However, when the update only removes nodes (and does not add new ones),
forcing the daemon to re-resolve and reconnect is unnecessary.
This change tracks the previous mapping size and skips sending SIGUSR1 when
the updated mapping strictly removes nodes.
Before:
SIGUSR1 triggered whenever the mapping changed.
After:
SIGUSR1 triggered only when nodes are added or mappings otherwise change,
but not when nodes are only removed.
This implements the TODO in the code suggesting that reconnects can be
skipped when the new IP set only removes addresses compared to the old set.