Commit 3799dc0
feat: add publish resilience gate for transient RabbitMQ connection drops (#77)
* fix: add publish resilience gate for connection recovery
During transient RabbitMQ connection drops, PublishAsync now suspends
rather than failing immediately. An AsyncManualResetEvent gate blocks
publishes while the connection is recovering and resumes them
automatically once recovery succeeds.
Key behaviors:
- Gate closes on non-application connection shutdown
- Gate opens on successful recovery
- Gate stays closed on recovery error (auto-recovery keeps retrying)
- Gate is signaled during disposal for fast shutdown
- Configurable PublishRecoveryTimeout (default 10s)
- TimeoutException wrapped in MessageBusException for consistent API
Closes #76
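
The gate behaviors listed above can be illustrated with a small sketch. This is a Python/asyncio analogue written for this note, not the C# code from the commit: `PublishGate`, its method names, and the exception message are hypothetical, and `asyncio.Event` stands in for the AsyncManualResetEvent.

```python
import asyncio


class MessageBusException(Exception):
    """Stand-in for the message bus's own exception type (hypothetical here)."""


class PublishGate:
    """Open/closed gate that publishers wait on while the connection recovers."""

    def __init__(self, recovery_timeout: float = 10.0) -> None:
        if recovery_timeout < 0:
            raise ValueError("recovery_timeout must not be negative")
        self._ready = asyncio.Event()
        self._ready.set()  # gate starts open: the connection is assumed healthy
        self._recovery_timeout = recovery_timeout

    def on_connection_shutdown(self, initiated_by_application: bool) -> None:
        # Only unexpected (non-application) shutdowns close the gate.
        if not initiated_by_application:
            self._ready.clear()

    def on_recovery_succeeded(self) -> None:
        self._ready.set()

    def on_recovery_error(self) -> None:
        # Gate stays closed; auto-recovery is expected to keep retrying.
        pass

    async def wait_until_ready(self) -> None:
        if self._ready.is_set():
            return  # fast path: nothing to wait for
        try:
            await asyncio.wait_for(self._ready.wait(), self._recovery_timeout)
        except asyncio.TimeoutError as ex:
            # Wrap the timeout so callers see one consistent exception type.
            ms = self._recovery_timeout * 1000
            raise MessageBusException(
                f"Publisher connection did not recover within {ms:.0f}ms"
            ) from ex
```

A publisher would call `await gate.wait_until_ready()` before touching the channel; the connection event handlers flip the gate.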
* fix: address CI failure and PR review feedback
- Fix CI: SimulatePublisherConnectionShutdownAsync now nulls _publisherChannel
to properly represent a real disconnect (test was passing because channel
stayed alive when gate was disabled)
- Fix CTS disposal: all CancellationTokenSource instances in tests now use
'using' declarations
- Fix timeout message: use TotalMilliseconds instead of TotalSeconds to
avoid misleading "0s" for sub-second timeouts
- Add SimulatePublisherConnectionRecoveryErrorAsync for proper recovery
error testing
- Rename test to PublishAsync_RecoveryErrorDoesNotOpenGate_WaitsUntilTimeout
and add actual recovery error simulation call
- Add input validation: PublishRecoveryTimeout rejects negative values
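
To make the timeout-message and validation fixes above concrete, here is a hedged Python sketch. The function names and message wording are invented for illustration, and the "misleading" variant assumes the original message rounded seconds to a whole number.

```python
from datetime import timedelta


def validate_recovery_timeout(timeout: timedelta) -> timedelta:
    """Reject negative values, mirroring the new PublishRecoveryTimeout check."""
    if timeout < timedelta(0):
        raise ValueError("PublishRecoveryTimeout must not be negative")
    return timeout


def format_timeout_seconds(timeout: timedelta) -> str:
    # Assumed shape of the old message: whole seconds hide sub-second values.
    return f"{timeout.total_seconds():.0f}s"


def format_timeout_milliseconds(timeout: timedelta) -> str:
    # Milliseconds keep sub-second timeouts readable.
    total_ms = timeout / timedelta(milliseconds=1)
    return f"{total_ms:.0f}ms"
```

With a 250 ms timeout the seconds-based variant renders "0s", while the milliseconds-based one renders "250ms".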
* refactor: move lock inside retry loop for channel re-read on each attempt
Applies industry best practice (NServiceBus/EasyNetQ pattern): re-read
_publisherChannel on each retry attempt rather than capturing once.
Changes:
- Lock acquired/released per retry (not held across all retries)
- IsOpen check rejects dead channels before attempting publish
- AlreadyClosedException from BasicPublishAsync triggers retry with
fresh channel state
- Other publishers and recovery handlers can proceed between retries
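
The per-attempt locking described above might look roughly like the following Python/asyncio sketch. `Publisher`, `ChannelClosedError`, and the channel's `is_open`/`publish` members are hypothetical stand-ins (for the real class, AlreadyClosedException, and the RabbitMQ channel); backoff between attempts is reduced to a bare yield.

```python
import asyncio


class ChannelClosedError(Exception):
    """Stand-in for AlreadyClosedException (hypothetical name)."""


class Publisher:
    def __init__(self, max_attempts: int = 3) -> None:
        self._lock = asyncio.Lock()
        self._max_attempts = max_attempts
        self.channel = None  # replaced by recovery handlers when it changes

    async def publish(self, body: bytes) -> None:
        last_error = ChannelClosedError("publisher channel is closed or unavailable")
        for _ in range(self._max_attempts):
            # Lock is taken per attempt, not held across the whole retry loop.
            async with self._lock:
                channel = self.channel  # re-read on every attempt
                if channel is not None and channel.is_open:
                    try:
                        await channel.publish(body)
                        return
                    except ChannelClosedError as ex:
                        last_error = ex  # channel died mid-publish: retry
            # Lock released: other publishers and recovery handlers can run.
            await asyncio.sleep(0)
        raise last_error
```

Capturing the channel once outside the loop would make every retry hit the same dead object; re-reading inside the lock lets a retry pick up a channel that recovery has swapped in.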
* fix: align test assertion with updated error message
The IsOpen check changed the message from "publisher channel was closed"
to "publisher channel is closed or unavailable" -- update assertion to match.
* refactor: move recovery gate inside retry loop, fix disposal and duplicate checks
- Gate check now runs on EACH retry attempt, preventing retries from
burning through the budget instantly when connection flaps
- Remove _publisherReady.Set() from CleanupAsync; the base class
DisposedCancellationToken already cancels in-flight publishes before
CleanupAsync runs, so manual gate-open is unnecessary and caused
publishers to unblock into a closed channel
- Remove redundant _isPublisherBlocked check from PublishImplAsync;
the authoritative check inside the lock in PublishMessageAsync is
the only one that matters (outer check was stale/racy)
- Update disposal test to assert OperationCanceledException (correct
behavior when disposal cancellation unblocks the gate wait)
* docs: clarify PublishRecoveryTimeout is per-attempt, increase test CTS margin
- Update XML doc comments to explicitly state the timeout is per-attempt
(resilience policy may retry, so total wall-clock can exceed the value)
- Increase CTS timeout in recovery-error test from 5s to 10s to prevent
CI flakiness (3 retry attempts x 500ms + backoff = ~4.5s)
* Address PR feedback: fix log message, add AnyContext, harden disposal test
- Branch log message for timeout=0 vs timeout>0 in OnPublisherConnectionOnConnectionShutdownAsync
- Add .AnyContext() to SimulatePublisherConnectionShutdownAsync
- Wrap disposal test in try/finally for leak safety
* fix: restore recovery gate reset on disposal, remove stale XML docs
- Restore _publisherReady.Set() in CleanupAsync to unblock publishers waiting on the recovery gate during disposal (removed in 00fca54). Without this, publishers with long PublishRecoveryTimeout values rely solely on DisposedCancellationToken propagation which is fragile.
- Remove stale <remarks> on PublishImplAsync advising channel-per-thread patterns that don't apply (single channel + AsyncLock architecture).
- Document blocked-state fail-fast design decision inline.
* fix: address review findings -- security, correctness, and code quality
- Sanitize connection string in constructor exceptions to avoid leaking
credentials (strip userinfo, show only scheme/host/port/path)
- Guard DeliveryDelay overflow: fail with clear ArgumentOutOfRangeException
instead of unhelpful OverflowException when delay > Int32.MaxValue ms
- Handle IPv6 bracket notation in ParseHostEndpoint ([::1]:port)
- Extract duplicated publisher channel creation into CreatePublisherChannelAsync
- Remove stale empty XML doc tags on PublishImplAsync and CreateConnectionAsync
- Add PERF comments at ToArray() call sites and publish lock documenting
allocation/serialization trade-offs (tracked in FoundatioFx/Foundatio#512)
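
As an illustration of the connection-string sanitization above (strip userinfo, keep scheme/host/port/path), here is a small Python sketch using the standard library. `sanitize_uri` is a hypothetical name and this is not the C# code from the commit; dropping the query string is an assumption based on the "only scheme/host/port/path" wording.

```python
from urllib.parse import urlsplit


def sanitize_uri(uri: str) -> str:
    """Return the URI without credentials, query, or fragment."""
    parts = urlsplit(uri)
    host = parts.hostname or ""  # note: urlsplit lowercases the hostname
    if ":" in host:
        host = f"[{host}]"  # IPv6 literal: restore the brackets
    port = f":{parts.port}" if parts.port is not None else ""
    return f"{parts.scheme}://{host}{port}{parts.path}"
```

For example, `sanitize_uri("amqp://user:secret@broker.example:5672/vhost")` yields `amqp://broker.example:5672/vhost`, which is safe to include in an exception message.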
* fix: simplify CreatePublisherChannelAsync and SanitizeUri per PR feedback
- Remove async/await from CreatePublisherChannelAsync and return the Task
directly since there is no code after the await and no using scope to clean up
- Remove the SanitizeUri(string) overload; the only call site where URI parse
failed cannot safely echo the value, so the message no longer includes it
- URI sanitization is still applied at the scheme-check exception via the Uri overload
* fix: volatile channel fields, disposal race, and ParseHostEndpoint fallback
- Add volatile to _publisherChannel/_subscriberChannel for safe
double-checked locking on ARM64 (outer null check needs memory barrier)
- Move _publisherReady.Set() after connection close in CleanupAsync to
prevent unblocked publishers from racing into a disposing channel
- Fix ParseHostEndpoint: invalid port fallback now uses parsed hostname
instead of the full trimmed string (e.g. "host:abc" -> "host", not
"host:abc")
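
The host-parsing rules mentioned in this commit (bracketed IPv6, invalid-port fallback to the parsed hostname, unbracketed IPv6) can be sketched as follows. This is a Python approximation with assumed details: the return shape, the default port of 5672 (the standard AMQP port), and the choice to treat an unbracketed IPv6 literal as a host with no port are guesses, not taken from the actual ParseHostEndpoint.

```python
from typing import Optional, Tuple

DEFAULT_PORT = 5672  # standard AMQP port; assumed default for this sketch


def _try_parse_port(text: str) -> Optional[int]:
    if text.isascii() and text.isdigit():
        port = int(text)
        if 0 < port <= 65535:
            return port
    return None


def parse_host_endpoint(value: str, default_port: int = DEFAULT_PORT) -> Tuple[str, int]:
    text = value.strip()
    if text.startswith("["):
        end = text.find("]")
        if end > 0:
            # Bracketed IPv6, e.g. "[::1]:5673"
            host, rest = text[1:end], text[end + 1:]
            port = _try_parse_port(rest[1:]) if rest.startswith(":") else None
            return host, port if port is not None else default_port
    if text.count(":") > 1:
        # Unbracketed IPv6: the colons are part of the address, not a port.
        return text, default_port
    host, _, port_text = text.partition(":")
    port = _try_parse_port(port_text)
    # Invalid port falls back to the parsed hostname, not the full string.
    return host, port if port is not None else default_port
```

With these rules, "host:abc" resolves to host "host" on the default port rather than keeping "host:abc" as the hostname.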
* fix: update stale x-delay comment to reflect current Int32 cast
* fix: handle unbracketed IPv6 in ParseHostEndpoint
1 parent 44baa63 commit 3799dc0
4 files changed
Lines changed: 345 additions & 63 deletions
File tree
- src/Foundatio.RabbitMQ
- Messaging
- Properties
- tests/Foundatio.RabbitMQ.Tests/Messaging