Commit d4fdd92
feat(alertrouter): B.3b — system-alert-router implementation (15/15 ACs, 100%) (#424)
Continues Slice B.3 (orchestration plumbing). The alert router is the
bridge between OpenWatch's event bus (B.3a) and external notification
channels — Slack, email, webhook, PagerDuty. It subscribes to bus
events at boot, translates each into a typed Alert, applies a dedup
gate per (alert_type, host_id, rule_id) tuple, and dispatches matching
alerts to registered Channels.
Concrete channel implementations live in subpackages so the core
router has no external SDK dependencies. This PR ships the interface +
a test fake; Slack, email, webhook implementations land in follow-ups.
Spec
New: app/specs/system/alert-router.spec.yaml (status: approved).
15 ACs across 9 constraints.
internal/alertrouter package
doc.go Architectural choices: bus subscription on Start, closed
AlertType + Severity enums, in-memory dedup with TTL,
channel registration with tag-filter routing, per-
channel goroutine with failure isolation, Stop drains
with 10s timeout.
types.go AlertType closed enum (HostUnreachable, HostRecovered,
DriftMajor, DriftMinor, DriftImprovement).
Severity closed enum (Critical, High, Medium, Low,
Info) + SeverityOrder rank map.
Alert struct with Type, Severity, HostID, RuleID, Tags.
Channel interface (Name + Send).
ChannelRegistration with Tags filter; empty Tags =
wildcard.
ValidateDedupTTL enforces [60s, 24h] range with typed
ErrDedupTTLOutOfRange.
dedup.go DedupGate keyed by Alert.DedupKey(); in-memory map with
opportunistic reap on every ShouldSkip call to keep
size bounded under churn. Injectable now() for testing.
router.go Router with Start (subscribe to HeartbeatPulse +
DriftDetected) / Stop (unsubscribe + drain
in-flight Channel.Send with 10s timeout).
Per-channel goroutine dispatch; one channel's error or
panic does NOT block delivery to other channels.
Event translation: HeartbeatPulse{Reachable=false}
→ host_unreachable (High); recovery → host_recovered
(Info); DriftDetected{major} → drift_major (High);
minor → drift_minor (Medium); improvement → Info.
metrics.go ReceivedCount + RoutedCount + DedupedCount +
ChannelFailureCount with JSON-friendly Snapshot.
Tests (15/15 ACs, all under -race)
types_test.go AC-01 enum closure (5 alert types).
AC-02 severity enum + SeverityOrder ranking.
AC-15 ValidateDedupTTL range check (boundary
cases + typed error sentinel).
router_test.go AC-03/04/05 event-to-alert translation.
AC-06 dedup skip within TTL (Channel.Send NOT
called for the skipped alert).
AC-07 dedup pass after TTL (fake clock on gate).
AC-08 tag-filter rejects non-match.
AC-09 wildcard channel (empty Tags) receives
every alert.
AC-10 channel error doesn't block other channels
(per-channel + aggregate failure counters).
AC-11 Start subscribes to BOTH event kinds.
AC-12 Stop drains slow sends + post-Stop publishes
ignored.
AC-14 all four metric counters increment under
compound scenarios.
source_test.go AC-13 internal/alertrouter (core, not subpackages)
imports no external notification SDKs
(slack-go, sendgrid, mailgun, twilio,
PagerDuty SDK, opsgenie, gomail, etc.).
AST-based import scan.
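For readers unfamiliar with the AC-13 technique, an AST-based import scan can be sketched with the standard go/parser package. The deny-list prefixes and the ForbiddenImports helper below are illustrative assumptions; the actual list and test structure in source_test.go are not shown in this commit message.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbiddenPrefixes is an illustrative deny-list, not the real one.
var forbiddenPrefixes = []string{
	"github.com/slack-go/slack",
	"github.com/sendgrid/",
	"github.com/PagerDuty/",
}

// ForbiddenImports parses only the import block of src and returns every
// import path that starts with a denied prefix.
func ForbiddenImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, spec := range f.Imports {
		// Path.Value is the quoted literal, e.g. "\"context\"".
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, prefix := range forbiddenPrefixes {
			if strings.HasPrefix(path, prefix) {
				hits = append(hits, path)
				break
			}
		}
	}
	return hits, nil
}

func main() {
	src := `package alertrouter

import (
	"context"

	"github.com/slack-go/slack"
)
`
	hits, err := ForbiddenImports("router.go", src)
	fmt.Println(hits, err)
}
```

A real test would presumably walk the package directory, skip subpackages, and call a check like this on each .go file; parsing with parser.ImportsOnly keeps the scan fast because function bodies are never parsed.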
Verification
go vet ./... clean
golangci-lint clean
govulncheck clean
go test -race -count=1 PASS (1.10s)
specter parse PASS (system-alert-router@1.0.0)
specter check PASS (32 specs)
specter coverage system-alert-router 15/15 (100%)
Spec: app/specs/system/alert-router.spec.yaml
1 parent b095781 commit d4fdd92
9 files changed
Lines changed: 1584 additions & 0 deletions
File tree
- app
- internal/alertrouter
- specs/system